1. Introduction

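The Grothendieck inequality asserts that there exists a universal constant $K \in (0, \infty)$
such that for every $m, n \in \mathbb{N}$ and every $m \times n$ matrix $A = (a_{ij})$ with real entries we have

$$\max\left\{ \sum_{i=1}^m \sum_{j=1}^n a_{ij} \langle x_i, y_j \rangle : \{x_i\}_{i=1}^m, \{y_j\}_{j=1}^n \subseteq S^{n+m-1} \right\} \le K \max\left\{ \sum_{i=1}^m \sum_{j=1}^n a_{ij} \varepsilon_i \delta_j : \{\varepsilon_i\}_{i=1}^m, \{\delta_j\}_{j=1}^n \subseteq \{-1, 1\} \right\}. \qquad (1)$$

Here, and in what follows, the standard scalar product on $\mathbb{R}^k$ is denoted $\langle x, y \rangle = \sum_{i=1}^k x_i y_i$
and the Euclidean sphere in $\mathbb{R}^k$ is denoted $S^{k-1} = \{x \in \mathbb{R}^k : \sum_{i=1}^k x_i^2 = 1\}$. We refer
to [Ml [56] for the simplest known proofs of the Grothendieck inequality; see Section 2.2 for
a proof of (1) yielding the best known bound on $K$. Grothendieck proved the inequality (1)
in [45], though it was stated there in a different, but equivalent, form. The formulation of
the Grothendieck inequality appearing in (1) is due to Lindenstrauss and Pełczyński [83].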

The Grothendieck inequality is of major importance to several areas, ranging from Banach 
space theory to C*-algebras and quantum information theory. We will not attempt to indicate
here this wide range of applications of (1), and refer instead to [H31 CHI HDHl ESI EH El HSl [D
Sni [Ml I102[ llUlj and the references therein. The purpose of this survey is to focus solely on
applications of the Grothendieck inequality and its variants to combinatorial optimization, 
and to explain their connections to computational complexity. 

The infimum over those $K \in (0, \infty)$ for which (1) holds for all $m, n \in \mathbb{N}$ and all $m \times n$
matrices $A = (a_{ij})$ is called the Grothendieck constant, and is denoted $K_G$. Evaluating the
exact value of $K_G$ remains a long-standing open problem, posed by Grothendieck in [45].
In fact, even the second digit of $K_G$ is currently unknown, though clearly this is of lesser
importance than the issue of understanding the structure of matrices $A$ and spherical configurations
$\{x_i\}_{i=1}^m, \{y_j\}_{j=1}^n \subseteq S^{n+m-1}$ which make the inequality (1) "most difficult". Following
a series of investigations jlHl [HSl 11071 [771 [78], the best known upper bound [21] on $K_G$ is

$$K_G < \frac{\pi}{2 \log\left(1 + \sqrt{2}\right)} = 1.782\ldots, \qquad (2)$$

and the best known lower bound [105] on $K_G$ is

$$K_G \ge \frac{\pi}{2} e^{\eta_0^2} = 1.676\ldots, \qquad (3)$$

where $\eta_0 = 0.25573\ldots$ is the unique solution of the equation

$$1 - 2\sqrt{\frac{2}{\pi}} \int_0^{\eta} e^{-z^2/2}\, dz = \frac{2}{\pi} e^{-\eta^2}.$$
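The numerical values quoted in (2) and (3) can be checked with a few lines of code. The sketch below assumes that the upper bound is $\pi / (2\log(1+\sqrt{2}))$, that the lower bound is $(\pi/2) e^{\eta_0^2}$, and that $\eta_0$ solves $1 - 2\sqrt{2/\pi}\int_0^{\eta} e^{-z^2/2}\,dz = (2/\pi) e^{-\eta^2}$; the function names are illustrative only.

```python
import math

def eta_equation(eta):
    # Left side minus right side of the assumed equation for eta_0.
    # The integral of exp(-z^2/2) over [0, eta] equals sqrt(pi/2) * erf(eta / sqrt(2)),
    # so the left side simplifies to 1 - 2 * erf(eta / sqrt(2)).
    left = 1.0 - 2.0 * math.erf(eta / math.sqrt(2.0))
    right = (2.0 / math.pi) * math.exp(-eta * eta)
    return left - right

def solve_eta(lo=0.0, hi=1.0, iterations=100):
    # Plain bisection: eta_equation is positive at 0, negative at 1,
    # and decreasing on [0, 1].
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if eta_equation(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eta0 = solve_eta()
upper_bound = math.pi / (2.0 * math.log(1.0 + math.sqrt(2.0)))
lower_bound = (math.pi / 2.0) * math.exp(eta0 ** 2)
print(eta0, lower_bound, upper_bound)
```

The printed values should agree with the digits $0.25573\ldots$, $1.676\ldots$ and $1.782\ldots$ stated in the text.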



In [104] the problem of estimating $K_G$ up to an additive error of $\varepsilon \in (0, 1)$ was reduced to an
optimization over a compact space, and by exhaustive search over an appropriate net it was
shown that there exists an algorithm that computes $K_G$ up to an additive error of $\varepsilon \in (0, 1)$
in time $\exp(\exp(O(1/\varepsilon^3)))$. It does not seem likely that this approach can yield computer
assisted proofs of estimates such as (2) and (3), though to the best of our knowledge this
has not been attempted.

In the above discussion we focused on the classical Grothendieck inequality (1). However,
the literature contains several variants and extensions of (1) that have been introduced for
various purposes and applications in the decades following Grothendieck's original work.
In this survey we describe some of these variants, emphasizing relatively recent develop- 
ments that yielded Grothendieck-type inequalities that are a useful tool in the design of 
polynomial time algorithms for computing approximate solutions of computationally hard 
optimization problems. In doing so, we omit some important topics, including applications 
of the Grothendieck inequality to communication complexity and quantum information the- 
ory. While these research directions can be viewed as dealing with a type of optimization 
problem, they are of a different nature than the applications described here, which belong to 
classical optimization theory. Connections to communication complexity have already been 
covered in the survey of Lee and Shraibman [81]; we refer in addition to [HH EHl EH ES]
for more information on this topic. An explanation of the relation of the Grothendieck
inequality to quantum mechanics is contained in Section 19 of Pisier's survey [101], the
pioneering work in this direction being that of Tsirelson [114]. An investigation of these
questions from a computational complexity point of view was initiated in ^J, where it was 
shown, for example, how to obtain a polynomial time algorithm for computing the entan- 
gled value of an XOR game based on Tsirelson's work. We hope that the developments 
surrounding applications of the Grothendieck inequality in quantum information theory will 
eventually be surveyed separately by experts in this area. Interested readers are referred 
to dH [2H1 E El EHl IMl ED 1221 EHl Eni M IM].

Perhaps the most influential variants of the Grothendieck inequality are its noncommutative
generalizations. The noncommutative versions in [991 H9] were conjectured by Grothendieck
himself [45]; additional extensions
to operator spaces are extensively discussed in Pisier's survey [101]. We will not describe
these developments here, even though we believe that they might have applications to op- 
timization theory. Finally, multi-linear extensions of the Grothendieck inequality have also 
been investigated in the literature; see for example |115[ I112[ [201 HOSj and especially Blei's 
book [19]. We will not cover this research direction since its relation to classical combinato- 
rial optimization has not (yet?) been established, though there are recent investigations of 
multi-linear Grothendieck inequalities in the context of quantum information theory [50] . 

Being a mainstay of functional analysis, the Grothendieck inequality might attract to 
this survey readers who are not familiar with approximation algorithms and computational 
complexity. We wish to encourage such readers to persist beyond this introduction so that 
they will be exposed to, and hopefully eventually contribute to, the use of analytic tools in 
combinatorial optimization. For this reason we include Sections 1.1 and 1.2 below: two very basic
introductory sections intended to quickly provide background on computational complexity 
and convex programming for non-experts. 



1.1. Assumptions from computational complexity. At present there are few uncondi- 
tional results on the limitations of polynomial time computation. The standard practice in 
this field is to frame an impossibility result in computational complexity by asserting that 
the polynomial time solvability of a certain algorithmic task would contradict a benchmark 
hypothesis. We briefly describe below two key hypotheses of this type. 

A graph $G = (V, E)$ is 3-colorable if there exists a partition $\{C_1, C_2, C_3\}$ of $V$ such that
for every $i \in \{1, 2, 3\}$ and $u, v \in C_i$ we have $\{u, v\} \notin E$. The $P \ne NP$ hypothesis asserts
that there is no polynomial time algorithm that takes an $n$-vertex graph as input and
determines whether or not it is 3-colorable. We are doing an injustice to this important
question by stating it this way, since it has many far-reaching equivalent formulations. We
refer to |3S1 11U81 EI] for more information, but for non-experts it suffices to keep the above
simple formulation in mind.
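To make the definition concrete, here is a minimal brute-force sketch of a 3-colorability test. It tries all $3^n$ colorings, so it runs in exponential time; it only illustrates the decision problem and says nothing about its complexity. The vertex labels $0, \ldots, n-1$ and the function name are assumptions made for the example.

```python
from itertools import product

def is_three_colorable(n, edges):
    """Brute-force test: try all 3^n colorings of the vertices 0..n-1.

    A coloring is valid when no edge joins two vertices of the same color,
    i.e. the three color classes form the partition {C_1, C_2, C_3}.
    """
    for coloring in product(range(3), repeat=n):
        if all(coloring[u] != coloring[v] for u, v in edges):
            return True
    return False

triangle = [(0, 1), (1, 2), (0, 2)]
complete_graph_4 = [(u, v) for u in range(4) for v in range(u + 1, 4)]
print(is_three_colorable(3, triangle))
print(is_three_colorable(4, complete_graph_4))
```

The triangle is 3-colorable, while the complete graph on four vertices is not.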

When we say that assuming $P \ne NP$ no polynomial time algorithm can perform a certain
task T (e.g., evaluating the maximum of a certain function up to a predetermined error) we
mean that given an algorithm ALG that performs the task T one can design an algorithm
ALG' that determines whether or not any input graph is 3-colorable while making at most
polynomially many calls to the algorithm ALG, with at most polynomially many additional
Turing machine steps. Thus, if ALG were a polynomial time algorithm then the same would
be true for ALG', contradicting the $P \ne NP$ hypothesis. Such results are called hardness
results. The message that non-experts should keep in mind is that a hardness result is
nothing more than the design of a new algorithm for 3-colorability, and if one accepts the
$P \ne NP$ hypothesis then it implies that there must exist inputs on which ALG takes
super-polynomial time to terminate.

The Unique Games Conjecture (UGC) asserts that for every $\varepsilon \in (0, 1)$ there exists a prime
$p = p(\varepsilon) \in \mathbb{N}$ such that no polynomial time algorithm can perform the following task. The
input is a system of $m$ linear equations in $n$ variables $x_1, \ldots, x_n$, each of which has the form
$x_i - x_j \equiv c_{ij} \mod p$ (thus the input is $S \subseteq \{1, \ldots, n\} \times \{1, \ldots, n\}$ and $\{c_{ij}\}_{(i,j) \in S} \subseteq \mathbb{N}$).
The algorithm must determine whether there exists an assignment of an integer value to
each variable $x_i$ such that at least $(1 - \varepsilon)m$ of the equations are satisfied, or whether no
assignment of such values can satisfy more than $\varepsilon m$ of the equations. If neither of these
possibilities occurs, then an arbitrary output is allowed.
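The quantity at stake in the UGC is the fraction of equations that a given assignment satisfies. The following sketch evaluates that fraction for one assignment; the encoding of an equation as a triple `(i, j, c)` with 0-based indices is an assumption made for illustration.

```python
def satisfied_fraction(equations, assignment, p):
    """Fraction of equations x_i - x_j = c (mod p) satisfied by `assignment`.

    `equations` is a list of triples (i, j, c); `assignment[i]` is the
    integer value given to the variable x_i.
    """
    satisfied = sum(
        1 for i, j, c in equations
        if (assignment[i] - assignment[j] - c) % p == 0
    )
    return satisfied / len(equations)

p = 5
equations = [(0, 1, 2), (1, 2, 3), (0, 2, 0)]
print(satisfied_fraction(equations, [2, 0, 2], p))
print(satisfied_fraction(equations, [0, 0, 0], p))
```

The task in the conjecture is to distinguish systems where some assignment reaches a fraction of at least $1 - \varepsilon$ from systems where every assignment stays at or below $\varepsilon$.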

As in the case of $P \ne NP$, saying that assuming the UGC no polynomial time algorithm
can perform a certain task T is the same as designing a polynomial time algorithm that
solves the above linear equations problem while making at most polynomially many calls to
a "black box" that can perform the task T. The UGC was introduced in [62], though the
above formulation of it, which is equivalent to the original one, is due to [M]. The use of
the UGC as a hardness hypothesis has become popular over the past decade; we refer to the
survey [63] for more information on this topic.

To simplify matters (while describing all the essential ideas), we allow polynomial time 
algorithms to be randomized. Most (if not all) of the algorithms described here can be turned 
into deterministic algorithms, and corresponding hardness results can be stated equally well 
in the context of randomized or deterministic algorithms. We will ignore these distinctions,
even though they are important. Moreover, it is widely believed that in our context these 
distinctions do not exist, i.e., randomness does not add computational power to polynomial 
time algorithms; see for example the discussion of the NP ^ BPP hypothesis in [11]. 

1.2. Convex and semidefinite programming. An important paradigm of optimization 
theory is that one can efficiently optimize linear functionals over compact convex sets that 
have a "membership oracle". A detailed exposition of this statement is contained in [16], 
but for the sake of completeness we now quote the precise formulation of the results that 
will be used in this article. 

Let $K \subseteq \mathbb{R}^n$ be a compact convex set. We are also given a point $z \in \mathbb{Q}^n$ and two radii
$r, R \in (0, \infty) \cap \mathbb{Q}$ such that $B(z, r) \subseteq K \subseteq B(z, R)$, where $B(z, t) = \{x \in \mathbb{R}^n : \|x - z\|_2 \le t\}$.
In what follows, stating that an algorithm is polynomial means that we allow the running time
to grow at most polynomially in the number of bits required to represent the data $(z, r, R)$.
Thus, if, say, $z = 0$, $r = 2^{-n}$ and $R = 2^n$ then the running time will be polynomial in the
dimension $n$. Assume that there exists an algorithm ALG with the following properties. The
input of ALG is a vector $y \in \mathbb{Q}^n$ and $\varepsilon \in (0, 1) \cap \mathbb{Q}$. The running time of ALG is polynomial
in $n$ and the number of bits required to represent the data $(\varepsilon, y)$. The output of ALG is the
assertion that either the distance of $y$ from $K$ is at most $\varepsilon$, or that the distance of $y$ from
the complement of $K$ is at most $\varepsilon$. Then there exists an algorithm ALG' that takes as input
a vector $c = (c_1, \ldots, c_n) \in \mathbb{Q}^n$ and $\varepsilon \in (0, 1) \cap \mathbb{Q}$ and outputs a vector $y = (y_1, \ldots, y_n) \in \mathbb{R}^n$
that is at distance at most $\varepsilon$ from $K$ and for every $x = (x_1, \ldots, x_n) \in K$ that is at distance
greater than $\varepsilon$ from the complement of $K$ we have $\sum_{i=1}^n c_i y_i \ge \sum_{i=1}^n c_i x_i - \varepsilon$. The running
time of ALG' is allowed to grow at most polynomially in $n$ and the number of bits required
to represent the data $(z, r, R, c, \varepsilon)$. This important result is due to [SZ]; we refer to [46] for
an excellent account of this theory.

The above statement is a key tool in optimization, as it yields a polynomial time method 
to compute the maximum of linear functionals on a given convex body with arbitrarily 
good precision. We note the following special case of this method, known as semidefinite 
programming. Assume that $n = k^2$ and think of $\mathbb{R}^n$ as the space of all $k \times k$ matrices. Assume
that we are given a compact convex set $K \subseteq \mathbb{R}^n$ that satisfies the above assumptions, and that
for a given $k \times k$ matrix $(c_{ij})$ we wish to compute in polynomial time (up to a specified additive
error) the maximum of $\sum_{i=1}^k \sum_{j=1}^k c_{ij} x_{ij}$ over the set of symmetric positive semidefinite
matrices $(x_{ij})$ that belong to $K$. This can indeed be done, since determining whether a given
symmetric matrix is (approximately) positive semidefinite is an eigenvalue computation and
hence can be performed in polynomial time. The use of semidefinite programming to design
approximation algorithms is by now a deep theory of fundamental importance to several
areas of theoretical computer science. The Goemans-Williamson MAX-CUT algorithm [42]
was a key breakthrough in this context. It is safe to say that after the discovery of this
algorithm the field of approximation algorithms was transformed, and many subsequent
results, including those presented in the present article, can be described as attempts to
mimic the success of the Goemans-Williamson approach in other contexts.
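The membership test mentioned above, deciding whether a symmetric matrix is approximately positive semidefinite, can be sketched as follows. This is only an illustration using NumPy's symmetric eigenvalue routine; the tolerance `tol` and the function name are assumptions, and a rigorous oracle would need exact control of the numerical error.

```python
import numpy as np

def is_approximately_psd(x, tol=1e-9):
    """Return True when x is symmetric and its smallest eigenvalue is >= -tol."""
    x = np.asarray(x, dtype=float)
    if x.shape[0] != x.shape[1] or not np.allclose(x, x.T):
        return False
    # eigvalsh is designed for symmetric matrices and returns real eigenvalues.
    smallest = np.linalg.eigvalsh(x).min()
    return bool(smallest >= -tol)

print(is_approximately_psd([[2.0, -1.0], [-1.0, 2.0]]))  # eigenvalues 1 and 3
print(is_approximately_psd([[1.0, 2.0], [2.0, 1.0]]))    # eigenvalues -1 and 3
```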



2. Applications of the classical Grothendieck inequality 

The classical Grothendieck inequality (1) has applications to algorithmic questions of
central interest. These applications will be described here in some detail. In Section 2.1 we
discuss the cut norm estimation problem, whose relation to the Grothendieck inequality was
first noted in [8]. This is a generic combinatorial optimization problem that contains
well-studied questions as subproblems. Examples of its usefulness are presented in Sections 2.1.1,
2.1.2, 2.1.3 and 2.1.4. Section 2.2 is devoted to the
method behind the proof of the best known upper bound on the Grothendieck constant.



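2.1. Cut norm estimation. Let $A = (a_{ij})$ be an $m \times n$ matrix with real entries. The cut
norm of $A$ is defined as follows.

$$\|A\|_{cut} = \max_{\substack{S \subseteq \{1, \ldots, m\} \\ T \subseteq \{1, \ldots, n\}}} \left| \sum_{i \in S} \sum_{j \in T} a_{ij} \right|. \qquad (4)$$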
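For very small matrices the definition (4) can be evaluated directly. The sketch below enumerates all row subsets $S$; for a fixed $S$ the best column subset $T$ consists either of all columns with positive restricted column sum or of all columns with negative restricted column sum. The running time is exponential in $m$, which is precisely what the polynomial time approximation algorithm discussed in this section avoids. The function name is an assumption.

```python
from itertools import product

def cut_norm(a):
    """Brute-force cut norm: max over S, T of |sum_{i in S, j in T} a[i][j]|."""
    m, n = len(a), len(a[0])
    best = 0
    for s in product((0, 1), repeat=m):
        # Column sums restricted to the rows selected by S.
        col = [sum(a[i][j] for i in range(m) if s[i]) for j in range(n)]
        positive = sum(c for c in col if c > 0)
        negative = -sum(c for c in col if c < 0)
        best = max(best, positive, negative)
    return best

print(cut_norm([[1, -1], [-1, 1]]))
print(cut_norm([[1, 2], [3, 4]]))
```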

We will now explain how the Grothendieck inequality can be used to obtain a polynomial
time algorithm for the following problem. The input is an $m \times n$ matrix $A = (a_{ij})$ with real
entries, and the goal of the algorithm is to output in polynomial time a number $\alpha$ that is
guaranteed to satisfy

$$\|A\|_{cut} \le \alpha \le C \|A\|_{cut}, \qquad (5)$$

where $C$ is a (hopefully not too large) universal constant. A closely related algorithmic goal
is to output in polynomial time two subsets $S_0 \subseteq \{1, \ldots, m\}$ and $T_0 \subseteq \{1, \ldots, n\}$ satisfying

$$\left| \sum_{i \in S_0} \sum_{j \in T_0} a_{ij} \right| \ge \frac{1}{C} \|A\|_{cut}. \qquad (6)$$
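The link to the Grothendieck inequality is made via two simple transformations. Firstly,
define an $(m + 1) \times (n + 1)$ matrix $B = (b_{ij})$ as follows.

$$B = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & -\sum_{k=1}^n a_{1k} \\
a_{21} & a_{22} & \cdots & a_{2n} & -\sum_{k=1}^n a_{2k} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & -\sum_{k=1}^n a_{mk} \\
-\sum_{\ell=1}^m a_{\ell 1} & -\sum_{\ell=1}^m a_{\ell 2} & \cdots & -\sum_{\ell=1}^m a_{\ell n} & \sum_{k=1}^n \sum_{\ell=1}^m a_{\ell k}
\end{pmatrix}. \qquad (7)$$

Observe that

$$\|A\|_{cut} = \|B\|_{cut}. \qquad (8)$$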

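Indeed, for every $S \subseteq \{1, \ldots, m + 1\}$ and $T \subseteq \{1, \ldots, n + 1\}$ define $S^* \subseteq \{1, \ldots, m\}$ and
$T^* \subseteq \{1, \ldots, n\}$ by

$$S^* = \begin{cases} S & \text{if } m + 1 \notin S, \\ \{1, \ldots, m\} \setminus S & \text{if } m + 1 \in S, \end{cases}
\qquad \text{and} \qquad
T^* = \begin{cases} T & \text{if } n + 1 \notin T, \\ \{1, \ldots, n\} \setminus T & \text{if } n + 1 \in T. \end{cases}$$

One checks that for all $S \subseteq \{1, \ldots, m + 1\}$ and $T \subseteq \{1, \ldots, n + 1\}$ we have

$$\left| \sum_{i \in S} \sum_{j \in T} b_{ij} \right| = \left| \sum_{i \in S^*} \sum_{j \in T^*} a_{ij} \right|,$$

implying (8). We next claim that

$$\|B\|_{cut} = \frac{1}{4} \|B\|_{\infty \to 1}, \qquad (9)$$

where

$$\|B\|_{\infty \to 1} = \max\left\{ \sum_{i=1}^{m+1} \sum_{j=1}^{n+1} b_{ij} \varepsilon_i \delta_j : \{\varepsilon_i\}_{i=1}^{m+1}, \{\delta_j\}_{j=1}^{n+1} \subseteq \{-1, 1\} \right\}. \qquad (10)$$
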
To explain this notation observe that $\|B\|_{\infty \to 1}$ is the norm of $B$ when viewed as a linear
operator from $\ell_\infty^{n+1}$ to $\ell_1^{m+1}$. Here, and in what follows, for $p \in [1, \infty]$ and $k \in \mathbb{N}$ the space
$\ell_p^k$ is $\mathbb{R}^k$ equipped with the $\ell_p$ norm $\|\cdot\|_p$, where $\|x\|_p^p = \sum_{j=1}^k |x_j|^p$ for $x = (x_1, \ldots, x_k) \in \mathbb{R}^k$
(for $p = \infty$ we set as usual $\|x\|_\infty = \max_{j \in \{1, \ldots, k\}} |x_j|$). Though it is important, this operator
theoretic interpretation of the quantity $\|B\|_{\infty \to 1}$ will not have any role in this survey, so it
may be harmlessly ignored at first reading.
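
The proof of (9) is simple: for $\{\varepsilon_i\}_{i=1}^{m+1}, \{\delta_j\}_{j=1}^{n+1} \subseteq \{-1, 1\}$ define $S^+, S^- \subseteq \{1, \ldots, m + 1\}$
and $T^+, T^- \subseteq \{1, \ldots, n + 1\}$ by setting $S^\pm = \{i \in \{1, \ldots, m + 1\} : \varepsilon_i = \pm 1\}$ and
$T^\pm = \{j \in \{1, \ldots, n + 1\} : \delta_j = \pm 1\}$. Then

$$\sum_{i=1}^{m+1} \sum_{j=1}^{n+1} b_{ij} \varepsilon_i \delta_j = \sum_{\substack{i \in S^+ \\ j \in T^+}} b_{ij} + \sum_{\substack{i \in S^- \\ j \in T^-}} b_{ij} - \sum_{\substack{i \in S^+ \\ j \in T^-}} b_{ij} - \sum_{\substack{i \in S^- \\ j \in T^+}} b_{ij} \le 4 \|B\|_{cut}.$$

This shows that $\|B\|_{\infty \to 1} \le 4 \|B\|_{cut}$ (for any matrix $B$, actually, not just the specific choice
in (7); we will use this observation later, in Section 2.1.3). In the reverse direction, given
$S \subseteq \{1, \ldots, m + 1\}$ and $T \subseteq \{1, \ldots, n + 1\}$ define for $i \in \{1, \ldots, m + 1\}$ and $j \in \{1, \ldots, n + 1\}$,

$$\varepsilon_i = \begin{cases} 1 & \text{if } i \in S, \\ -1 & \text{if } i \notin S, \end{cases}
\qquad \text{and} \qquad
\delta_j = \begin{cases} 1 & \text{if } j \in T, \\ -1 & \text{if } j \notin T. \end{cases}$$

Then, since the sum of each row and each column of $B$ vanishes,

$$\sum_{i \in S} \sum_{j \in T} b_{ij} = \sum_{i=1}^{m+1} \sum_{j=1}^{n+1} b_{ij} \cdot \frac{1 + \varepsilon_i}{2} \cdot \frac{1 + \delta_j}{2} = \frac{1}{4} \sum_{i=1}^{m+1} \sum_{j=1}^{n+1} b_{ij} \varepsilon_i \delta_j \le \frac{1}{4} \|B\|_{\infty \to 1}.$$

This completes the proof of (9). We summarize the above simple transformations in the
following lemma.

Lemma 2.1. Let $A = (a_{ij})$ be an $m \times n$ matrix with real entries and let $B = (b_{ij})$ be the
$(m + 1) \times (n + 1)$ matrix given in (7). Then

$$\|A\|_{cut} = \frac{1}{4} \|B\|_{\infty \to 1}.$$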



A consequence of Lemma 2.1 is that the problem of approximating $\|A\|_{cut}$ in polynomial
time is equivalent to the problem of approximating $\|A\|_{\infty \to 1}$ in polynomial time in the sense
that any algorithm for one of these problems can be used to obtain an algorithm for the other
problem with the same running time (up to constant factors) and the same (multiplicative)
approximation guarantee.
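Lemma 2.1 can be sanity-checked numerically on a small example. The sketch below builds the bordered matrix $B$ (appending to $A$ a column of negated row sums and a row of negated column sums, with the total sum in the corner, as in (7)) and compares brute-force values of $\|A\|_{cut}$ and $\frac{1}{4}\|B\|_{\infty \to 1}$. All function names are illustrative, and both norms are computed by exhaustive search, so this is feasible only for tiny matrices.

```python
from itertools import product

def cut_norm(a):
    """Brute-force max over S, T of |sum_{i in S, j in T} a[i][j]|."""
    m, n = len(a), len(a[0])
    best = 0
    for s in product((0, 1), repeat=m):
        col = [sum(a[i][j] for i in range(m) if s[i]) for j in range(n)]
        best = max(best, sum(c for c in col if c > 0), -sum(c for c in col if c < 0))
    return best

def inf_to_one_norm(b):
    """Brute-force max over signs of sum b_ij eps_i delta_j.

    For fixed delta the optimal eps_i is the sign of the i-th row sum,
    so the value is sum_i |sum_j b_ij delta_j|.
    """
    n = len(b[0])
    return max(
        sum(abs(sum(row[j] * d[j] for j in range(n))) for row in b)
        for d in product((-1, 1), repeat=n)
    )

def bordered(a):
    """The (m+1) x (n+1) matrix B whose rows and columns all sum to zero."""
    m, n = len(a), len(a[0])
    rows = [list(r) + [-sum(r)] for r in a]
    last = [-sum(a[i][j] for i in range(m)) for j in range(n)]
    last.append(sum(sum(r) for r in a))
    return rows + [last]

a = [[1, -2, 3], [0, 4, -1]]
b = bordered(a)
print(cut_norm(a), cut_norm(b), inf_to_one_norm(b) / 4)
```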

Given an $m \times n$ matrix $A = (a_{ij})$ consider the following quantity.

$$\mathrm{SDP}(A) = \max\left\{ \sum_{i=1}^m \sum_{j=1}^n a_{ij} \langle x_i, y_j \rangle : \{x_i\}_{i=1}^m, \{y_j\}_{j=1}^n \subseteq S^{n+m-1} \right\}. \qquad (12)$$

The maximization problem in (12) falls into the framework of semidefinite programming
as discussed in Section 1.2. Therefore $\mathrm{SDP}(A)$ can be computed in polynomial time with
arbitrarily good precision. It is clear that $\mathrm{SDP}(A) \ge \|A\|_{\infty \to 1}$, because the maximum in (12)
is over a bigger set than the maximum in (10). The Grothendieck inequality says that
$\mathrm{SDP}(A) \le K_G \|A\|_{\infty \to 1}$, so we have

$$\|A\|_{\infty \to 1} \le \mathrm{SDP}(A) \le K_G \|A\|_{\infty \to 1}.$$

Thus, the polynomial time algorithm that outputs the number $\mathrm{SDP}(A)$ is guaranteed to be
within a factor of $K_G$ of $\|A\|_{\infty \to 1}$. By Lemma 2.1, the algorithm that outputs the number
$\alpha = \frac{1}{4} \mathrm{SDP}(B)$, where the matrix $B$ is as in (7), satisfies (5) with $C = K_G$.

Section 7 is devoted to algorithmic impossibility results. However, it is worthwhile to make
at this juncture two comments regarding hardness of approximation. First of all, unless
$P = NP$, we need to introduce an error $C > 1$ in our requirement (5). This was observed
in [8]: the classical MAXCUT problem from algorithmic graph theory was shown in [8] to
be a special case of the problem of computing $\|A\|_{cut}$, and therefore by [51] we know that
unless $P = NP$ there does not exist a polynomial time algorithm that outputs a number $\alpha$
satisfying (5) with $C$ strictly smaller than $\frac{17}{16}$. In fact, by a reduction to the MAX DICUT
problem one can show that $C$ must be at least $\frac{13}{12}$, unless $P = NP$; we refer to Section 7
and [8] for more information on this topic.

Another (more striking) algorithmic impossibility result is based on the Unique Games
Conjecture (UGC). Clearly the above algorithm cannot yield an approximation guarantee
strictly smaller than $K_G$ (this is the definition of $K_G$). In fact, it was shown in [104] that
unless the UGC is false, for every $\varepsilon \in (0, 1)$ any polynomial time algorithm for estimating
$\|A\|_{cut}$ whatsoever, and not only the specific algorithm described above, must make an
error of at least $K_G - \varepsilon$ on some input matrix $A$. Thus, if we assume the UGC then the
classical Grothendieck constant has a complexity theoretic interpretation: it equals the best
approximation ratio of polynomial time algorithms for the cut norm problem. Note that [104]
manages to prove this statement despite the fact that the value of $K_G$ is unknown.

We have thus far ignored the issue of finding in polynomial time the subsets $S_0, T_0$
satisfying (6), i.e., we only explained how the Grothendieck inequality can be used for
polynomial time estimation of the quantity $\|A\|_{cut}$ without actually finding efficiently
subsets at which $\|A\|_{cut}$ is approximately attained. In order to do this we cannot use the
Grothendieck inequality as a black box: we need to look into its proof and argue that it
yields a polynomial time procedure that converts vectors $\{x_i\}_{i=1}^m, \{y_j\}_{j=1}^n \subseteq S^{n+m-1}$ into
signs $\{\varepsilon_i\}_{i=1}^m, \{\delta_j\}_{j=1}^n \subseteq \{-1, 1\}$ (this is known as a rounding procedure). It is indeed
possible to do so, as explained in Section 2.2. We postpone the explanation of the rounding
procedure that hides behind the Grothendieck inequality in order to first give examples why
one might want to efficiently compute the cut norm of a matrix.
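To illustrate what a rounding procedure looks like as code, here is a sketch of the simplest such scheme: project every vector onto one random Gaussian direction and keep the sign. This is only an example of the interface (vectors in, signs out); it is not claimed to be the procedure of Section 2.2, and no approximation guarantee is asserted for it. The function name and the `seed` parameter are assumptions.

```python
import random

def round_to_signs(xs, ys, seed=0):
    """Map vectors xs, ys to signs using one shared random Gaussian direction."""
    rng = random.Random(seed)
    dim = len(xs[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]

    def sign(v):
        # Sign of the scalar product <g, v>; ties are broken towards +1.
        return 1 if sum(gk * vk for gk, vk in zip(g, v)) >= 0 else -1

    return [sign(x) for x in xs], [sign(y) for y in ys]

xs = [(1.0, 0.0), (0.0, 1.0)]
ys = [(1.0, 0.0), (-1.0, 0.0)]
eps, delta = round_to_signs(xs, ys)
print(eps, delta)
```

Equal vectors always receive equal signs, and antipodal vectors receive opposite signs (except on the probability-zero event of a tie), which is the basic reason such schemes preserve some of the correlation structure of the vector solution.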



2.1.1. Szemerédi partitions. The Szemerédi regularity lemma [111] (see also [72]) is a general
and very useful structure theorem for graphs, asserting (informally) that any graph can be
partitioned into a controlled number of pieces that interact with each other in a
pseudo-random way. The Grothendieck inequality, via the cut norm estimation algorithm, yields a
polynomial time algorithm that, when given a graph $G = (V, E)$ as input, outputs a partition
of $V$ that satisfies the conclusion of the Szemerédi regularity lemma.

To make the above statements formal, we need to recall some definitions. Let $G = (V, E)$
be a graph. For every disjoint $X, Y \subseteq V$ denote the number of edges joining $X$ and $Y$ by
$e(X, Y) = |\{(u, v) \in X \times Y : \{u, v\} \in E\}|$. Let $X, Y \subseteq V$ be disjoint and nonempty, and
fix $\varepsilon, \delta \in (0, 1)$. The pair of vertex sets $(X, Y)$ is called $(\varepsilon, \delta)$-regular if for every $S \subseteq X$ and
$T \subseteq Y$ that are not too small, the quantity $\frac{e(S, T)}{|S| \cdot |T|}$ (the density of edges between $S$ and $T$) is
essentially independent of the pair $(S, T)$ itself. Formally, we require that for every $S \subseteq X$
with $|S| \ge \delta |X|$ and every $T \subseteq Y$ with $|T| \ge \delta |Y|$ we have

$$\left| \frac{e(S, T)}{|S| \cdot |T|} - \frac{e(X, Y)}{|X| \cdot |Y|} \right| \le \varepsilon. \qquad (13)$$

The almost uniformity of the numbers $\frac{e(S, T)}{|S| \cdot |T|}$ as exhibited in (13) says that the pair $(X, Y)$ is
"pseudo-random", i.e., it is similar to a random bipartite graph where each $(x, y) \in X \times Y$
is joined by an edge independently with probability $\frac{e(X, Y)}{|X| \cdot |Y|}$.
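The definition of $(\varepsilon, \delta)$-regularity can be checked directly on toy examples by enumerating all sufficiently large subsets. The sketch below does exactly that; it takes exponential time and is meant only to make condition (13) concrete. The adjacency representation (a dictionary mapping each vertex to its set of neighbours) and the function names are assumptions.

```python
from itertools import combinations
from math import ceil

def edge_count(adj, s, t):
    """e(S, T): number of pairs (u, v) in S x T joined by an edge."""
    return sum(1 for u in s for v in t if v in adj[u])

def large_subsets(items, delta):
    """All nonempty subsets of `items` of size at least delta * len(items)."""
    smallest = max(1, ceil(delta * len(items)))
    for size in range(smallest, len(items) + 1):
        yield from combinations(items, size)

def is_regular(adj, x, y, eps, delta):
    """Exhaustively test condition (13) for the pair (X, Y)."""
    base = edge_count(adj, x, y) / (len(x) * len(y))
    for s in large_subsets(x, delta):
        for t in large_subsets(y, delta):
            density = edge_count(adj, s, t) / (len(s) * len(t))
            if abs(density - base) > eps:
                return False
    return True

complete = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1}}
matching = {0: {2}, 1: {3}, 2: {0}, 3: {1}}
print(is_regular(complete, [0, 1], [2, 3], eps=0.25, delta=0.5))
print(is_regular(matching, [0, 1], [2, 3], eps=0.25, delta=0.5))
```

The complete bipartite pair has density 1 on every sub-pair and is regular; the perfect matching has overall density 1/2 but density 1 on a matched pair of single vertices, so it fails the test.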



The Szemerédi regularity lemma says that for all $\varepsilon, \delta, \eta \in (0, 1)$ and $k \in \mathbb{N}$ there exists
$K = K(\varepsilon, \delta, \eta, k) \in \mathbb{N}$ such that for all $n \in \mathbb{N}$ any $n$-vertex graph $G = (V, E)$ can be
partitioned into $m$ sets $S_1, \ldots, S_m \subseteq V$ with the following properties

• $k \le m \le K$,

• $\big| |S_i| - |S_j| \big| \le 1$ for all $i, j \in \{1, \ldots, m\}$,

• the number of $i, j \in \{1, \ldots, m\}$ with $i < j$ such that the pair $(S_i, S_j)$ is $(\varepsilon, \delta)$-regular
is at least $(1 - \eta) \binom{m}{2}$.

Thus every graph is almost a superposition of a bounded number of pseudo-random graphs, 
the key point being that K is independent of n and the specific combinatorial structure of 
the graph in question. 

It would be of interest to have a way to produce a Szemerédi partition in polynomial time
with $K$ independent of $n$ (this is a good example of an approximation algorithm: one might
care to find such a partition into the minimum possible number of pieces, but producing any
partition into boundedly many pieces is already a significant achievement). Such a
polynomial time algorithm was designed in [5] (see also [73]). We refer to [3 [73] for applications
of algorithms for constructing Szemerédi partitions, and to [5] for a discussion of the
computational complexity of this algorithmic task. We shall now explain how the Grothendieck
inequality yields a different approach to this problem, which has some advantages over [5, 73]
that will be described later. The argument below is due to [8].

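Assume that $X, Y$ are disjoint $n$-point subsets of a graph $G = (V, E)$. How can we
determine in polynomial time whether or not the pair $(X, Y)$ is close to being $(\varepsilon, \delta)$-regular? It
turns out that this is the main "bottleneck" towards our goal to construct Szemerédi
partitions in polynomial time. To this end consider the following $n \times n$ matrix $A = (a_{xy})_{(x, y) \in X \times Y}$.

$$a_{xy} = \begin{cases} 1 - \frac{e(X, Y)}{|X| \cdot |Y|} & \text{if } \{x, y\} \in E, \\ -\frac{e(X, Y)}{|X| \cdot |Y|} & \text{if } \{x, y\} \notin E. \end{cases} \qquad (14)$$

By the definition of $A$, if $S \subseteq X$ and $T \subseteq Y$ then

$$\left| \sum_{x \in S} \sum_{y \in T} a_{xy} \right| = |S| \cdot |T| \cdot \left| \frac{e(S, T)}{|S| \cdot |T|} - \frac{e(X, Y)}{|X| \cdot |Y|} \right|. \qquad (15)$$

Hence if $(X, Y)$ is not $(\varepsilon, \delta)$-regular then $\|A\|_{cut} \ge \varepsilon \delta^2 n^2$. The approximate cut norm
algorithm based on the Grothendieck inequality, together with the rounding procedure in
Section 2.2, finds in polynomial time subsets $S \subseteq X$ and $T \subseteq Y$ such that

$$\min\left\{ n|S|,\ n|T|,\ n^2 \left| \frac{e(S, T)}{|S| \cdot |T|} - \frac{e(X, Y)}{|X| \cdot |Y|} \right| \right\} \ge \left| \sum_{x \in S} \sum_{y \in T} a_{xy} \right| \ge \frac{1}{K_G} \varepsilon \delta^2 n^2 \ge \frac{1}{2} \varepsilon \delta^2 n^2.$$

This establishes the following lemma.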

Lemma 2.2. There exists a polynomial time algorithm that takes as input two disjoint
$n$-point subsets $X, Y$ of a graph, and either decides that $(X, Y)$ is $(\varepsilon, \delta)$-regular or finds $S \subseteq X$
and $T \subseteq Y$ with

$$|S|, |T| \ge \frac{1}{2} \varepsilon \delta^2 n \qquad \text{and} \qquad \left| \frac{e(S, T)}{|S| \cdot |T|} - \frac{e(X, Y)}{|X| \cdot |Y|} \right| \ge \frac{1}{2} \varepsilon \delta^2.$$

From Lemma 2.2 it is quite simple to design a polynomial time algorithm that constructs a
Szemerédi partition with bounded cardinality; compare Lemma 2.2 to Corollary 3.3 in [5]
and Theorem 1.5 in [73]. We will not explain this deduction here since it is identical to
the argument in [S]. We note that the quantitative bounds in Lemma 2.2 improve over the
corresponding bounds in [3 [T5| yielding, say, when $\varepsilon = \delta = \eta$, an algorithm with the best
known bound on $K$ as a function of $\varepsilon$ (this bound is nevertheless still huge, as must be the
case due to [B]; see also [30]). See [8] for a precise statement of these bounds. In addition,
the algorithms of [5, 73] worked only in the "dense case", i.e., when $\|A\|_{cut}$, for $A$ as in (14),
is of order $n^2$, while the above algorithm does not have this requirement. This observation
can be used to design the only known polynomial time algorithm for sparse versions of the
Szemerédi regularity lemma [3] (see also [41]). We will not discuss the sparse version of the
regularity lemma here, and refer instead to [TU [72] for a discussion of this topic. We also
refer to [1] for additional applications of the Grothendieck inequality in sparse settings.

2.1.2. Frieze-Kannan matrix decomposition. The cut norm estimation problem was
originally raised in the work of Frieze and Kannan [38] which introduced a method to design
polynomial time approximation schemes for dense constraint satisfaction problems. The key
tool for this purpose is a decomposition theorem for matrices that we now describe.

An $m \times n$ matrix $D = (d_{ij})$ is called a cut matrix if there exist subsets $S \subseteq \{1, \ldots, m\}$
and $T \subseteq \{1, \ldots, n\}$, and $d \in \mathbb{R}$ such that for all $(i, j) \in \{1, \ldots, m\} \times \{1, \ldots, n\}$ we have

$$d_{ij} = \begin{cases} d & \text{if } (i, j) \in S \times T, \\ 0 & \text{if } (i, j) \notin S \times T. \end{cases} \qquad (16)$$

Denote the matrix $D$ defined in (16) by $\mathrm{CUT}(S, T, d)$. In [38] it is proved that for every
$\varepsilon > 0$ there exists an integer $s = O(1/\varepsilon^2)$ such that for any $m \times n$ matrix $A = (a_{ij})$ with
entries bounded in absolute value by 1, there are cut matrices $D_1, \ldots, D_s$ satisfying

$$\left\| A - \sum_{k=1}^s D_k \right\|_{cut} \le \varepsilon m n. \qquad (17)$$

Moreover, these cut matrices $D_1, \ldots, D_s$ can be found in time $C(\varepsilon)(mn)^{O(1)}$. We shall now
explain how this is done using the cut norm approximation algorithm of Section 2.1.

The argument is iterative. Set $A_0 = A$, and assuming that the cut matrices $D_1, \ldots, D_r$
have already been defined write $A_r = (a_{ij}(r)) = A - \sum_{k=1}^r D_k$. We are done if $\|A_r\|_{cut} \le \varepsilon m n$,
so we may assume that $\|A_r\|_{cut} > \varepsilon m n$. By the cut norm approximation algorithm we can
find in polynomial time $S \subseteq \{1, \ldots, m\}$ and $T \subseteq \{1, \ldots, n\}$ satisfying

$$\left| \sum_{i \in S} \sum_{j \in T} a_{ij}(r) \right| \ge c \|A_r\|_{cut} \ge c \varepsilon m n, \qquad (18)$$

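where $c > 0$ is a universal constant. Set

$$d = \frac{1}{|S| \cdot |T|} \sum_{i \in S} \sum_{j \in T} a_{ij}(r).$$

Define $D_{r+1} = \mathrm{CUT}(S, T, d)$ and $A_{r+1} = (a_{ij}(r + 1)) = A_r - D_{r+1}$. Then by expanding the
squares we have,

$$\sum_{i=1}^m \sum_{j=1}^n a_{ij}(r + 1)^2 = \sum_{i=1}^m \sum_{j=1}^n a_{ij}(r)^2 - \frac{1}{|S| \cdot |T|} \left( \sum_{i \in S} \sum_{j \in T} a_{ij}(r) \right)^2 \le \sum_{i=1}^m \sum_{j=1}^n a_{ij}(r)^2 - c^2 \varepsilon^2 m n.$$

It follows inductively that if we can carry out this procedure $r$ times then

$$0 \le \sum_{i=1}^m \sum_{j=1}^n a_{ij}(r)^2 \le \sum_{i=1}^m \sum_{j=1}^n a_{ij}^2 - r c^2 \varepsilon^2 m n \le m n - r c^2 \varepsilon^2 m n,$$

where we used the assumption that $|a_{ij}| \le 1$. Therefore the above iteration must terminate
after $\lceil 1/(c^2 \varepsilon^2) \rceil$ steps, yielding (17). We note that the bound $s = O(1/\varepsilon^2)$ in (17) cannot be
improved [6]; see also [HHl 130] for related lower bounds.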
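The iteration just described can be sketched in code. In the sketch below the polynomial time cut norm approximation algorithm is replaced by an exhaustive search (so that the constant $c$ equals 1 and the example is self-contained); this substitution is an assumption made purely for illustration and is only feasible for tiny matrices. The function names are hypothetical.

```python
import math
from itertools import product

def best_cut(a):
    """Exhaustive search for (value, S, T) maximizing |sum_{i in S, j in T} a[i][j]|."""
    m, n = len(a), len(a[0])
    best = (0.0, [], [])
    for s in product((0, 1), repeat=m):
        rows = [i for i in range(m) if s[i]]
        col = [sum(a[i][j] for i in rows) for j in range(n)]
        for t in ([j for j in range(n) if col[j] > 0],
                  [j for j in range(n) if col[j] < 0]):
            value = abs(sum(col[j] for j in t))
            if value > best[0]:
                best = (value, rows, t)
    return best

def cut_decomposition(a, eps):
    """Subtract cut matrices until the residual has cut norm at most eps * m * n."""
    m, n = len(a), len(a[0])
    residual = [[float(v) for v in row] for row in a]
    cuts = []
    while True:
        value, s, t = best_cut(residual)
        if value <= eps * m * n:
            return cuts, residual
        # d is the average of the residual over S x T, as in the text.
        d = sum(residual[i][j] for i in s for j in t) / (len(s) * len(t))
        cuts.append((s, t, d))
        for i in s:
            for j in t:
                residual[i][j] -= d

a = [[1, -1, 1], [-1, 1, 1], [1, 1, -1]]
eps = 0.25
cuts, residual = cut_decomposition(a, eps)
print(len(cuts), best_cut(residual)[0])
```

Each step lowers the sum of squared entries of the residual by at least $\varepsilon^2 mn$ (with $c = 1$), so for entries bounded by 1 in absolute value the loop performs at most $\lceil 1/\varepsilon^2 \rceil$ subtractions.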



The key step in the above algorithm was finding sets $S, T$ as in (18). In [38] an algorithm
was designed that, given an $m \times n$ matrix $A = (a_{ij})$ and $\varepsilon > 0$ as input, produces in time
$2^{1/\varepsilon^{O(1)}} (mn)^{O(1)}$ subsets $S \subseteq \{1, \ldots, m\}$ and $T \subseteq \{1, \ldots, n\}$ satisfying

$$\left| \sum_{i \in S} \sum_{j \in T} a_{ij} \right| \ge \|A\|_{cut} - \varepsilon m n. \qquad (19)$$

The additive approximation guarantee in (19) implies (18) only if $\|A\|_{cut} \ge \varepsilon (c + 1) m n$, and
similarly the running time is not polynomial if, say, $\varepsilon = n^{-\Omega(1)}$. Thus the Kannan-Frieze
method is relevant only to "dense" instances, while the cut norm algorithm based on the
Grothendieck inequality applies equally well for all values of $\|A\|_{cut}$. This fact, combined
with more work (and, necessarily, additional assumptions on the matrix $A$), was used in [29]
to obtain a sparse version of (17): with $\varepsilon m n$ in the right hand side of (17) replaced by
$\varepsilon \|A\|_{cut}$ and $s = O(1/\varepsilon^2)$ (importantly, here $s$ is independent of $m, n$).

We have indicated above how the cut norm approximation problem is relevant to Kannan- 
Frieze matrix decompositions, but we did not indicate the uses of such decompositions since 
this is beyond the scope of the current survey. We refer to [3H1 El [151 EH] for a variety of 
applications of this methodology to combinatorial optimization problems. 

2.1.3. Maximum acyclic subgraph. In the maximum acyclic subgraph problem we are given
as input an $n$-vertex directed graph $G = (\{1, \ldots, n\}, E)$. Thus $E$ consists of a family of
ordered pairs of distinct elements in $\{1, \ldots, n\}$. We are interested in the maximum of

$$\left| \{(i, j) \in \{1, \ldots, n\}^2 : \sigma(i) < \sigma(j)\} \cap E \right| - \left| \{(i, j) \in \{1, \ldots, n\}^2 : \sigma(i) > \sigma(j)\} \cap E \right|$$

over all possible permutations $\sigma \in S_n$ ($S_n$ denotes the group of permutations of $\{1, \ldots, n\}$).
In words, the quantity of interest is the maximum over all orderings of the vertices of the
number of edges going "forward" minus the number of edges going "backward". The best
known approximation algorithm for this problem was discovered in [26] as an application of
the cut norm approximation algorithm.

It is most natural to explain the algorithm of [26] for a weighted version of the maximum
acyclic subgraph problem. Let $W : \{1, \ldots, n\} \times \{1, \ldots, n\} \to \mathbb{R}$ be skew symmetric, i.e.,
$W(u, v) = -W(v, u)$ for all $u, v \in \{1, \ldots, n\}$. For $\sigma \in S_n$ define

$$W(\sigma) = \sum_{\substack{u, v \in \{1, \ldots, n\} \\ u < v}} W(\sigma(u), \sigma(v)).$$

Thus $W(\sigma)$ is the sum of the entries of $W$ that lie above the diagonal after the rows and
columns of $W$ have been permuted according to the permutation $\sigma$. We are interested in the
quantity $M_W = \max_{\sigma \in S_n} W(\sigma)$. The case of a directed graph $G = (\{1, \ldots, n\}, E)$ described
above corresponds to the matrix $W(u, v) = \mathbf{1}_{\{(u, v) \in E\}} - \mathbf{1}_{\{(v, u) \in E\}}$.
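The weighted formulation can be made concrete with a short sketch that builds the skew symmetric matrix of a directed graph, evaluates $W(\sigma)$, and computes $M_W$ by trying all permutations. Vertices are labeled $0, \ldots, n-1$ here rather than $1, \ldots, n$, and the function names are assumptions; the exhaustive maximization takes $n!$ steps and is shown only to fix the definitions.

```python
from itertools import permutations

def skew_from_graph(n, edges):
    """W(u, v) = 1 if (u, v) is an edge, minus 1 if (v, u) is an edge."""
    w = [[0] * n for _ in range(n)]
    for u, v in edges:
        w[u][v] += 1
        w[v][u] -= 1
    return w

def ordering_value(w, sigma):
    """W(sigma): sum over positions u < v of W(sigma[u], sigma[v])."""
    n = len(w)
    return sum(w[sigma[u]][sigma[v]] for u in range(n) for v in range(u + 1, n))

def max_ordering_value(w):
    """M_W by exhaustive search over all permutations."""
    return max(ordering_value(w, sigma) for sigma in permutations(range(len(w))))

cycle = skew_from_graph(3, [(0, 1), (1, 2), (2, 0)])
path = skew_from_graph(3, [(0, 1), (1, 2)])
print(max_ordering_value(cycle), max_ordering_value(path))
```

For the directed 3-cycle every ordering has at least one backward edge, so the best value is 2 - 1 = 1; for the directed path the identity ordering puts both edges forward.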

Theorem 2.3 ([26]). There exists a polynomial time algorithm that takes as input an $n \times n$ skew
symmetric $W : \{1, \ldots, n\} \times \{1, \ldots, n\} \to \mathbb{R}$ and outputs a permutation $\sigma \in S_n$ satisfying

$$W(\sigma) \gtrsim \frac{M_W}{\log n}.$$

Proof. The proof below is a slight variant of the reasoning of [26]. By the cut norm
approximation algorithm one can find in polynomial time two subsets $S, T \subseteq \{1, \ldots, n\}$ satisfying

$$\sum_{u \in S} \sum_{v \in T} W(u, v) \ge c \|W\|_{cut}, \qquad (20)$$

where $c \in (0, \infty)$ is a universal constant. Note that we do not need to take the absolute
value of the left hand side of (20) because $W$ is skew symmetric. Observe also that since $W$
is skew symmetric we have $\sum_{u, v \in S \cap T} W(u, v) = 0$, and therefore

$$\sum_{\substack{u \in S \\ v \in T}} W(u, v) = \sum_{\substack{u \in S \setminus T \\ v \in T \setminus S}} W(u, v) + \sum_{\substack{u \in S \setminus T \\ v \in S \cap T}} W(u, v) + \sum_{\substack{u \in S \cap T \\ v \in T \setminus S}} W(u, v).$$

By replacing the pair of subsets $(S, T)$ by one of $\{(S \setminus T, T \setminus S), (S \setminus T, S \cap T), (S \cap T, T \setminus S)\}$,
and replacing the constant $c$ in (20) by $c/3$, we may assume without loss of generality that (20)
holds with $S$ and $T$ disjoint. Denote $R = \{1, \ldots, n\} \setminus (S \cup T)$ and write $S = \{s_1, \ldots, s_{|S|}\}$,
$T = \{t_1, \ldots, t_{|T|}\}$ and $R = \{r_1, \ldots, r_{|R|}\}$, where $s_1 < \cdots < s_{|S|}$, $t_1 < \cdots < t_{|T|}$ and
$r_1 < \cdots < r_{|R|}$.

Define two permutations $\sigma^1,\sigma^2\in S_n$ as follows.

$$\sigma^1(u)=\begin{cases}s_u&\text{if }u\in\{1,\ldots,|S|\},\\ t_{u-|S|}&\text{if }u\in\{|S|+1,\ldots,|S|+|T|\},\\ r_{u-|S|-|T|}&\text{if }u\in\{|S|+|T|+1,\ldots,n\},\end{cases}$$

and

$$\sigma^2(u)=\begin{cases}r_{|R|-u+1}&\text{if }u\in\{1,\ldots,|R|\},\\ s_{|R|+|S|-u+1}&\text{if }u\in\{|R|+1,\ldots,|R|+|S|\},\\ t_{n-u+1}&\text{if }u\in\{|R|+|S|+1,\ldots,n\}.\end{cases}$$

(Footnote: Here, and in what follows, the relations $\gtrsim,\lesssim$ indicate the corresponding inequalities up to an absolute
factor. The relation $\asymp$ stands for $\gtrsim\wedge\lesssim$.)

In words, $\sigma^1$ orders $\{1,\ldots,n\}$ by starting with the elements of $S$ in increasing order, then the
elements of $T$ in increasing order, and finally the elements of $R$ in increasing order. At the
same time, $\sigma^2$ orders $\{1,\ldots,n\}$ by starting with the elements of $R$ in decreasing order, then
the elements of $S$ in decreasing order, and finally the elements of $T$ in decreasing order. The
quantity $W(\sigma^1)+W(\sigma^2)$ consists of a sum of terms of the form $W(u,v)$ for $u,v\in\{1,\ldots,n\}$,
where if $(u,v)\in(S\times S)\cup(T\times T)\cup(R\times\{1,\ldots,n\})$ then both $W(u,v)$ and $W(v,u)$ appear
exactly once in this sum, and if $(u,v)\in S\times T$ then $W(u,v)$ appears twice in this sum
and $W(v,u)$ does not appear in this sum at all. Therefore, using the fact that $W$ is skew
symmetric we have the following identity.

$$W(\sigma^1)+W(\sigma^2)=2\sum_{\substack{u\in S\\ v\in T}}W(u,v).$$
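As a sanity check on the identity $W(\sigma^1)+W(\sigma^2)=2\sum_{u\in S,\,v\in T}W(u,v)$ for disjoint $S,T$, the following Python sketch builds the two orderings described above for a random skew symmetric integer matrix and compares both sides. The helper names, the random instance and the particular sets $S,T$ are hypothetical choices made only for illustration; vertices are 0-based.

```python
import random

def ordering_value(W, order):
    """Sum of W[a][b] over all pairs in which a is placed before b in `order`."""
    m = len(order)
    return sum(W[order[p]][order[q]] for p in range(m) for q in range(p + 1, m))

def two_orderings(n, S, T):
    """sigma^1 = (S up, T up, R up) and sigma^2 = (R down, S down, T down), S and T disjoint."""
    S, T = sorted(S), sorted(T)
    R = sorted(set(range(n)) - set(S) - set(T))
    return S + T + R, R[::-1] + S[::-1] + T[::-1]

random.seed(0)
n = 8
W = [[0] * n for _ in range(n)]
for u in range(n):
    for v in range(u + 1, n):
        W[u][v] = random.randint(-5, 5)
        W[v][u] = -W[u][v]          # skew symmetry

S, T = {0, 3, 5}, {1, 6}
sigma1, sigma2 = two_orderings(n, S, T)
lhs = ordering_value(W, sigma1) + ordering_value(W, sigma2)
rhs = 2 * sum(W[u][v] for u in S for v in T)
```

Since the two values sum to `rhs`, the better of the two orderings has value at least `rhs / 2`, which is the step used next in the proof.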



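It follows that for some $\ell\in\{1,2\}$ we have

$$W(\sigma^\ell)\ \ge\ \sum_{\substack{u\in S\\ v\in T}}W(u,v)\ \stackrel{(20)}{\ge}\ c\,\|W\|_{cut}.$$

The output of the algorithm will be the permutation $\sigma^\ell$, so it suffices to prove that

$$\|W\|_{cut}\gtrsim\frac{M_W}{\log n}. \tag{21}$$

We will prove below that

$$\|W\|_{cut}\gtrsim\frac{1}{\log n}\sum_{\substack{u,v\in\{1,\ldots,n\}\\ u<v}}W(u,v). \tag{22}$$

Inequality (21) follows by applying (22) to $W'(u,v)=W(\sigma(u),\sigma(v))$ for every $\sigma\in S_n$.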



To prove (22) first note that $\|W\|_{cut}\ge\frac14\|W\|_{\infty\to1}$; we have already proved this inequality
as a consequence of the simple identity (11). Moreover, we have

$$\|W\|_{\infty\to1}\ \gtrsim\ \max\Bigl\{\sum_{u=1}^n\sum_{v=1}^nW(u,v)\sin(\alpha_u-\beta_v):\ \{\alpha_u\}_{u=1}^n,\{\beta_v\}_{v=1}^n\subseteq\mathbb{R}\Bigr\}. \tag{23}$$

Inequality (23) is a special case of (1) with the choice of vectors $x_u=(\sin\alpha_u,\cos\alpha_u)\in\mathbb{R}^2$ and
$y_v=(\cos\beta_v,-\sin\beta_v)\in\mathbb{R}^2$. We note that this two-dimensional version of the Grothendieck
inequality is trivial with the constant in the right hand side of (23) being $\frac12$, and it is shown
in [78] that the best constant in the right hand side of (23) is actually $\frac{1}{\sqrt2}$.

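For every $\theta_1,\ldots,\theta_n\in\mathbb{R}$, an application of (23) when $\alpha_u=\beta_u=\theta_u$ and when $\alpha_u=\beta_u=-\theta_u$
yields the inequality

$$\|W\|_{cut}\ \gtrsim\ \Bigl|\sum_{u=1}^n\sum_{v=1}^nW(u,v)\sin(\theta_u-\theta_v)\Bigr|=2\,\Bigl|\sum_{\substack{u,v\in\{1,\ldots,n\}\\ u<v}}W(u,v)\sin(\theta_u-\theta_v)\Bigr|, \tag{24}$$

where for the equality in (24) we used the fact that $W$ is skew symmetric. Consequently, for
every $k\in\mathbb{N}$ we have

$$\|W\|_{cut}\ \gtrsim\ \Bigl|\sum_{\substack{u,v\in\{1,\ldots,n\}\\ u<v}}W(u,v)\sin\Bigl(\frac{\pi(v-u)k}{n}\Bigr)\Bigr|. \tag{25}$$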



By the standard orthogonality relation for the sine function, for every $u,v\in\{1,\ldots,n\}$
such that $u<v$ we have

$$\frac{2}{n}\sum_{k=1}^{n-1}\sum_{\ell=1}^{n-1}\sin\Bigl(\frac{\pi(v-u)k}{n}\Bigr)\sin\Bigl(\frac{\pi k\ell}{n}\Bigr)=1. \tag{26}$$

Readers who are unfamiliar with (26) are referred to its derivation in the appendix of [26];
it can be proved by substituting $\sin(\pi(v-u)k/n)=(e^{i\pi(v-u)k/n}-e^{-i\pi(v-u)k/n})/(2i)$ and
$\sin(\pi k\ell/n)=(e^{i\pi k\ell/n}-e^{-i\pi k\ell/n})/(2i)$ into the left hand side of (26) and computing the
resulting geometric sums explicitly. Now,



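$$\begin{aligned}\sum_{\substack{u,v\in\{1,\ldots,n\}\\ u<v}}W(u,v)&\stackrel{(26)}{=}\frac{2}{n}\sum_{\substack{u,v\in\{1,\ldots,n\}\\ u<v}}\sum_{k=1}^{n-1}\sum_{\ell=1}^{n-1}W(u,v)\sin\Bigl(\frac{\pi(v-u)k}{n}\Bigr)\sin\Bigl(\frac{\pi k\ell}{n}\Bigr)\\ &\le\frac{2}{n}\sum_{k=1}^{n-1}\Bigl|\sum_{\ell=1}^{n-1}\sin\Bigl(\frac{\pi k\ell}{n}\Bigr)\Bigr|\cdot\Bigl|\sum_{\substack{u,v\in\{1,\ldots,n\}\\ u<v}}W(u,v)\sin\Bigl(\frac{\pi(v-u)k}{n}\Bigr)\Bigr|\\ &\stackrel{(25)}{\lesssim}\frac{\sum_{k=1}^{n-1}\bigl|\sum_{\ell=1}^{n-1}\sin\bigl(\frac{\pi k\ell}{n}\bigr)\bigr|}{n}\,\|W\|_{cut}.\end{aligned}$$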

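Hence, the desired inequality (22) will follow from $\sum_{k=1}^{n-1}\bigl|\sum_{\ell=1}^{n-1}\sin(\pi k\ell/n)\bigr|\lesssim n\log n$. To
establish this estimate observe that by writing $\sin(\pi k\ell/n)=(e^{i\pi k\ell/n}-e^{-i\pi k\ell/n})/(2i)$ and
computing geometric sums explicitly, one sees that $\sum_{\ell=1}^{n-1}\sin(\pi k\ell/n)=0$ if $k$ is even and
$\sum_{\ell=1}^{n-1}\sin(\pi k\ell/n)=\cot(\pi k/(2n))$ if $k$ is odd (see the appendix of [26] for the details of this
computation). Hence, since $\cot(\theta)<1/\theta$ for every $\theta\in(0,\pi/2)$, we have

$$\sum_{k=1}^{n-1}\Bigl|\sum_{\ell=1}^{n-1}\sin\Bigl(\frac{\pi k\ell}{n}\Bigr)\Bigr|=\sum_{j=0}^{\lfloor n/2\rfloor-1}\cot\Bigl(\frac{\pi(2j+1)}{2n}\Bigr)\le\frac{2n}{\pi}\sum_{j=0}^{\lfloor n/2\rfloor-1}\frac{1}{2j+1}\lesssim n\log n.$$

□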
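The two trigonometric facts used in the proof, namely the orthogonality relation (26) and the closed form of $\sum_{\ell=1}^{n-1}\sin(\pi k\ell/n)$, are easy to check numerically. The Python sketch below does this for one value of $n$; it is only a floating point sanity check with function names of our own choosing, not a substitute for the derivation in the appendix of [26].

```python
import math

def sine_row_sum(n, k):
    """sum_{l=1}^{n-1} sin(pi*k*l/n)."""
    return sum(math.sin(math.pi * k * l / n) for l in range(1, n))

def orthogonality_lhs(n, d):
    """(2/n) * sum_{k,l=1}^{n-1} sin(pi*d*k/n) * sin(pi*k*l/n), where d plays the role of v - u."""
    return (2.0 / n) * sum(math.sin(math.pi * d * k / n) * sine_row_sum(n, k)
                           for k in range(1, n))

def total_abs_sine_sum(n):
    """sum_{k=1}^{n-1} |sum_{l=1}^{n-1} sin(pi*k*l/n)|."""
    return sum(abs(sine_row_sum(n, k)) for k in range(1, n))

n = 40
# (26) for every possible difference d = v - u in {1, ..., n-1}.
identity_26_ok = all(abs(orthogonality_lhs(n, d) - 1.0) < 1e-9 for d in range(1, n))
# Row sums vanish for even k and equal cot(pi*k/(2n)) for odd k.
cot_ok = all(
    abs(sine_row_sum(n, k) - (1.0 / math.tan(math.pi * k / (2 * n)) if k % 2 else 0.0)) < 1e-9
    for k in range(1, n)
)
# Upper bound (2n/pi) * sum_{j=0}^{floor(n/2)-1} 1/(2j+1), which is of order n log n.
harmonic_bound = (2 * n / math.pi) * sum(1.0 / (2 * j + 1) for j in range(n // 2))
```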



2.1.4. Linear equations modulo 2. Consider a system $\mathcal{E}$ of $N$ linear equations modulo 2 in
$n$ Boolean variables $z_1,\ldots,z_n$ such that in each equation appear only three distinct variables.
Let $\mathrm{MAXSAT}(\mathcal{E})$ be the maximum number of equations in $\mathcal{E}$ that can be satisfied
simultaneously. A random $\{0,1\}$ assignment of these variables satisfies in expectation $N/2$
equations, so it is natural to ask for a polynomial time approximation algorithm to the quantity
$\mathrm{MAXSAT}(\mathcal{E})-N/2$. We describe below the best known [65] approximation algorithm
for this problem, which uses the Grothendieck inequality in a crucial way. The approximation
guarantee thus obtained is $O(\sqrt{n/\log n})$. While this allows for a large error, it is shown
in [52] that for every $\varepsilon\in(0,1)$ if there were a polynomial time algorithm that approximates
$\mathrm{MAXSAT}(\mathcal{E})-N/2$ to within a factor of $2^{(\log n)^{1-\varepsilon}}$ in time $2^{(\log n)^{O(1)}}$ then there would be an
algorithm for 3-colorability that runs in time $2^{(\log n)^{O(1)}}$, a conclusion which is widely believed
to be impossible.

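Let $\mathcal{E}$ be a system of linear equations as described above. Write $a_{ijk}=1$ if the equation
$z_i+z_j+z_k=0$ is in the system $\mathcal{E}$. Similarly write $a_{ijk}=-1$ if the equation $z_i+z_j+z_k=1$
is in $\mathcal{E}$. Finally, write $a_{ijk}=0$ if no equation in $\mathcal{E}$ corresponds to $z_i+z_j+z_k$. Assume that
the assignment $(z_1,\ldots,z_n)\in\{0,1\}^n$ satisfies $m$ of the equations in $\mathcal{E}$. Then

$$\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^na_{ijk}(-1)^{z_i+z_j+z_k}=m-(N-m)=2\Bigl(m-\frac{N}{2}\Bigr).$$

It follows that

$$\max\Bigl\{\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^na_{ijk}\varepsilon_i\varepsilon_j\varepsilon_k:\ \{\varepsilon_i\}_{i=1}^n\subseteq\{-1,1\}\Bigr\}=2\Bigl(\mathrm{MAXSAT}(\mathcal{E})-\frac{N}{2}\Bigr)\stackrel{\mathrm{def}}{=}M. \tag{27}$$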
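The passage from counting satisfied equations to the signed cubic form in (27) can be checked by brute force on a small random system. In the Python sketch below, each equation is stored once as an increasing triple $i<j<k$ together with its right hand side, so that the triple sum counts every equation exactly once; this storage convention, the generator `random_system` and all parameter values are assumptions made for illustration.

```python
import random
from itertools import combinations, product

def random_system(n, num_equations, rng):
    """Hypothetical generator: distinct triples i<j<k, each with a right hand side in {0, 1}."""
    triples = rng.sample(list(combinations(range(n), 3)), num_equations)
    return {t: rng.randint(0, 1) for t in triples}

def num_satisfied(system, z):
    """Number of equations z_i + z_j + z_k = rhs (mod 2) satisfied by the assignment z."""
    return sum((z[i] + z[j] + z[k]) % 2 == rhs for (i, j, k), rhs in system.items())

def signed_sum(system, z):
    """sum of a_ijk * (-1)^(z_i+z_j+z_k), with a_ijk = +1 for rhs 0 and -1 for rhs 1."""
    return sum((1 if rhs == 0 else -1) * (-1) ** (z[i] + z[j] + z[k])
               for (i, j, k), rhs in system.items())

rng = random.Random(1)
n, N = 6, 12
system = random_system(n, N, rng)
assignments = list(product((0, 1), repeat=n))

# The identity sum = m - (N - m) = 2(m - N/2) for every assignment.
identity_ok = all(signed_sum(system, z) == 2 * num_satisfied(system, z) - N
                  for z in assignments)
maxsat = max(num_satisfied(system, z) for z in assignments)
M = max(signed_sum(system, z) for z in assignments)   # left hand side of (27)
```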



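We will now present a randomized polynomial time algorithm that outputs a number $\alpha\in\mathbb{R}$
which satisfies with probability at least $\frac12$,

$$\frac{1}{20K_G}\sqrt{\frac{\log n}{n}}\,M\ \le\ \alpha\ \le\ M. \tag{28}$$

Fix $m\in\mathbb{N}$ that will be determined later. Choose $\varepsilon^1,\ldots,\varepsilon^m\in\{-1,1\}^n$ independently and
uniformly at random and consider the following random variable.

$$\alpha\stackrel{\mathrm{def}}{=}\frac{1}{10K_G}\max_{\ell\in\{1,\ldots,m\}}\max\Bigl\{\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^na_{ijk}\varepsilon_i^\ell\langle y_j,z_k\rangle:\ \{y_j\}_{j=1}^n,\{z_k\}_{k=1}^n\subseteq S^{2n-1}\Bigr\}. \tag{29}$$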



By the Grothendieck inequality we know that

$$\alpha\le\frac{1}{10}\max\Bigl\{\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^na_{ijk}\varepsilon_i\delta_j\zeta_k:\ \{\varepsilon_i\}_{i=1}^n,\{\delta_j\}_{j=1}^n,\{\zeta_k\}_{k=1}^n\subseteq\{-1,1\}\Bigr\}\le M. \tag{30}$$

The final step in (30) follows from an elementary decoupling argument; see [HHl Lem. 2.1].
We claim that

$$\Pr\Bigl[\alpha\ge\frac{1}{20K_G}\sqrt{\frac{\log n}{n}}\,M\Bigr]\ \ge\ 1-e^{-cm/\sqrt[4]{n}}. \tag{31}$$

Once (31) is established, it would follow that for $m\asymp\sqrt[4]{n}$ we have $\alpha\ge\frac{1}{20K_G}\sqrt{\frac{\log n}{n}}\,M$ with
probability at least $\frac12$. This combined with (30) would complete the proof of (28) since
$\alpha$ as defined in (29) can be computed in polynomial time, being the maximum of $O(\sqrt[4]{n})$
semidefinite programs.

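To check (31) let $\|\cdot\|$ be the norm on $\mathbb{R}^n$ defined for every $x=(x_1,\ldots,x_n)\in\mathbb{R}^n$ by

$$\|x\|=\max\Bigl\{\sum_{i=1}^n\sum_{j=1}^n\sum_{k=1}^na_{ijk}x_i\langle y_j,z_k\rangle:\ \{y_j\}_{j=1}^n,\{z_k\}_{k=1}^n\subseteq S^{2n-1}\Bigr\}.$$

Define $K=\{x\in\mathbb{R}^n:\ \|x\|\le1\}$ and let $K^\circ=\{y\in\mathbb{R}^n:\ \sup_{x\in K}\langle x,y\rangle\le1\}$ be the polar
of $K$. Then $\max\{\|y\|_1:\ y\in K^\circ\}=\max\{\|x\|:\ \|x\|_\infty\le1\}\ge M$, where the first equality
is straightforward duality and the final inequality is a consequence of the definition of $\|\cdot\|$
and $M$. It follows that there exists $y\in K^\circ$ with $\|y\|_1\ge M$. Since $\|x\|\ge\langle x,y\rangle$ for every $x\in\mathbb{R}^n$, we have

$$\Pr\Bigl[\alpha\ge\frac{1}{20K_G}\sqrt{\frac{\log n}{n}}\,M\Bigr]\stackrel{(29)}{=}1-\prod_{\ell=1}^m\Pr\Bigl[\|\varepsilon^\ell\|<\frac12\sqrt{\frac{\log n}{n}}\,M\Bigr]\ \ge\ 1-\Bigl(1-\Pr\Bigl[\sum_{i=1}^n\varepsilon_i^1y_i\ge\frac12\sqrt{\frac{\log n}{n}}\,\|y\|_1\Bigr]\Bigr)^m.$$

In order to prove (31) it therefore suffices to prove that if $\varepsilon$ is chosen uniformly at random
from $\{-1,1\}^n$ and $a\in\mathbb{R}^n$ satisfies $\|a\|_1=1$ then

$$\Pr\Bigl[\sum_{i=1}^n\varepsilon_ia_i\ge\sqrt{\frac{\log n}{4n}}\Bigr]\ \ge\ \frac{c}{\sqrt[4]{n}},$$

where $c\in(0,\infty)$ is a universal constant. This probabilistic estimate for i.i.d. Bernoulli sums
can be proved directly; see [65, Lem. 3.2].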



2.2. Rounding. Let $A=(a_{ij})$ be an $m\times n$ matrix. In Section 2.1 we described a polynomial
time algorithm for approximating $\|A\|_{cut}$ and $\|A\|_{\infty\to1}$. For applications it is also important
to find in polynomial time signs $\varepsilon_1,\ldots,\varepsilon_m,\delta_1,\ldots,\delta_n\in\{-1,1\}$ for which $\sum_{i=1}^m\sum_{j=1}^na_{ij}\varepsilon_i\delta_j$
is at least a constant multiple of $\|A\|_{\infty\to1}$. This amounts to a "rounding problem": we
need to find a procedure that, given vectors $x_1,\ldots,x_m,y_1,\ldots,y_n\in S^{m+n-1}$, produces signs
$\varepsilon_1,\ldots,\varepsilon_m,\delta_1,\ldots,\delta_n\in\{-1,1\}$ whose existence is ensured by the Grothendieck inequality, i.e.,
$\sum_{i=1}^m\sum_{j=1}^na_{ij}\varepsilon_i\delta_j$ is at least a constant multiple of $\sum_{i=1}^m\sum_{j=1}^na_{ij}\langle x_i,y_j\rangle$. For this purpose
one needs to examine proofs of the Grothendieck inequality, as done in [8]. We will now
describe the rounding procedure that gives the best known approximation guarantee. This
procedure yields a randomized algorithm that produces the desired signs; it is also possible
to obtain a deterministic algorithm, as explained in [8].

The argument below is based on a clever two-step rounding method due to Krivine [77].
Fix $k\in\mathbb{N}$ and assume that we are given two centrally symmetric measurable partitions of
$\mathbb{R}^k$, or equivalently two odd measurable functions $f,g:\mathbb{R}^k\to\{-1,1\}$. Let $G_1,G_2\in\mathbb{R}^k$
be independent random vectors that are distributed according to the standard Gaussian
measure on $\mathbb{R}^k$, i.e., the measure with density $x\mapsto e^{-\|x\|_2^2/2}/(2\pi)^{k/2}$. For $t\in(-1,1)$ define

$$H_{f,g}(t)\stackrel{\mathrm{def}}{=}\mathbb{E}\Bigl[f\Bigl(\frac{1}{\sqrt2}G_1\Bigr)g\Bigl(\frac{t}{\sqrt2}G_1+\frac{\sqrt{1-t^2}}{\sqrt2}G_2\Bigr)\Bigr]=\frac{1}{\pi^k(1-t^2)^{k/2}}\int_{\mathbb{R}^k}\int_{\mathbb{R}^k}f(x)g(y)\exp\Bigl(\frac{-\|x\|_2^2-\|y\|_2^2+2t\langle x,y\rangle}{1-t^2}\Bigr)dx\,dy. \tag{32}$$

Then $H_{f,g}$ extends to an analytic function on the strip $\{z\in\mathbb{C}:\ \Re(z)\in(-1,1)\}$. The pair
of functions $\{f,g\}$ is called a Krivine rounding scheme if $H_{f,g}$ is invertible on a neighborhood
of the origin, and if we consider the Taylor expansion $H_{f,g}^{-1}(z)=\sum_{j=0}^\infty a_{2j+1}z^{2j+1}$ then there
exists $c=c(f,g)\in(0,\infty)$ satisfying $\sum_{j=0}^\infty|a_{2j+1}|c^{2j+1}=1$.

For $(f,g)$ as above and unit vectors $\{x_i\}_{i=1}^m,\{y_j\}_{j=1}^n\subseteq S^{m+n-1}$, one can find new unit
vectors $\{u_i\}_{i=1}^m,\{v_j\}_{j=1}^n\subseteq S^{m+n-1}$ satisfying the identities

$$\forall(i,j)\in\{1,\ldots,m\}\times\{1,\ldots,n\},\qquad\langle u_i,v_j\rangle=H_{f,g}^{-1}\bigl(c(f,g)\langle x_i,y_j\rangle\bigr). \tag{33}$$

We refer to [21] for the proof that $\{u_i\}_{i=1}^m,\{v_j\}_{j=1}^n$ exist. This existence proof is not via an
efficient algorithm, but as explained in [8], once we know that they exist the new vectors
can be computed efficiently provided $H_{f,g}^{-1}$ can be computed efficiently; this simply amounts
to computing a Cholesky decomposition or, alternatively, solving a semidefinite program
corresponding to (33). This completes the first (preprocessing) step of a generalized Krivine
rounding procedure. The next step is to apply a random projection to the new vectors thus
obtained, as in Grothendieck's original proof [45] or the Goemans-Williamson algorithm [42].



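Let $G:\mathbb{R}^{m+n}\to\mathbb{R}^k$ be a random $k\times(m+n)$ matrix whose entries are i.i.d. standard
Gaussian random variables. Define random signs $\{\varepsilon_i\}_{i=1}^m,\{\delta_j\}_{j=1}^n\subseteq\{-1,1\}$ by

$$\forall(i,j)\in\{1,\ldots,m\}\times\{1,\ldots,n\},\qquad\varepsilon_i\stackrel{\mathrm{def}}{=}f\Bigl(\frac{1}{\sqrt2}Gu_i\Bigr)\quad\text{and}\quad\delta_j\stackrel{\mathrm{def}}{=}g\Bigl(\frac{1}{\sqrt2}Gv_j\Bigr). \tag{34}$$

Now,

$$\mathbb{E}\Bigl[\sum_{i=1}^m\sum_{j=1}^na_{ij}\varepsilon_i\delta_j\Bigr]\stackrel{(*)}{=}\sum_{i=1}^m\sum_{j=1}^na_{ij}H_{f,g}\bigl(\langle u_i,v_j\rangle\bigr)\stackrel{(33)}{=}c(f,g)\sum_{i=1}^m\sum_{j=1}^na_{ij}\langle x_i,y_j\rangle, \tag{35}$$

where $(*)$ follows by rotation invariance from (34) and (32). The identity (35) yields the
desired polynomial time randomized rounding algorithm, provided one can bound $c(f,g)$
from below. It also gives a systematic way to bound the Grothendieck constant from above:
for every Krivine rounding scheme $f,g:\mathbb{R}^k\to\{-1,1\}$ we have $K_G\le1/c(f,g)$. Krivine
used this reasoning to obtain the bound $K_G\le\pi/(2\log(1+\sqrt2))$ by considering the case
$k=1$ and $f_0(x)=g_0(x)=\mathrm{sign}(x)$. One checks that $\{f_0,g_0\}$ is a Krivine rounding scheme
with $H_{f_0,g_0}(t)=\frac{2}{\pi}\arcsin(t)$ (Grothendieck's identity) and $c(f_0,g_0)=\frac{2}{\pi}\log(1+\sqrt2)$.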
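For Krivine's one dimensional scheme $f_0=g_0=\mathrm{sign}$, both claims can be checked numerically. Since $H_{f_0,g_0}(t)=\frac2\pi\arcsin(t)$, its inverse is $\sin(\pi z/2)$, whose Taylor coefficients have absolute values $(\pi/2)^{2j+1}/(2j+1)!$, so the normalization $\sum_j|a_{2j+1}|c^{2j+1}=1$ reads $\sinh(\pi c/2)=1$. The Python sketch below verifies this for $c=\frac2\pi\log(1+\sqrt2)$ and estimates $H_{f_0,g_0}(1/2)$ by Monte Carlo; the function names, the sample size and the seed are arbitrary choices of ours.

```python
import math
import random

def krivine_constant():
    """c(f0, g0) = (2/pi) * log(1 + sqrt(2)) for f0 = g0 = sign."""
    return (2.0 / math.pi) * math.log(1.0 + math.sqrt(2.0))

def absolute_series_at(c, terms=40):
    """sum_j |a_{2j+1}| c^(2j+1) for H^{-1}(z) = sin(pi*z/2); this is a partial sum of sinh(pi*c/2)."""
    return sum((math.pi * c / 2.0) ** (2 * j + 1) / math.factorial(2 * j + 1)
               for j in range(terms))

def estimate_H(t, samples, rng):
    """Monte Carlo estimate of E[sign(G1) * sign(t*G1 + sqrt(1 - t^2)*G2)] for k = 1."""
    s = math.sqrt(1.0 - t * t)
    total = 0
    for _ in range(samples):
        g1, g2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        total += (1 if g1 >= 0 else -1) * (1 if t * g1 + s * g2 >= 0 else -1)
    return total / samples

c = krivine_constant()
series_value = absolute_series_at(c)        # should be 1 up to rounding
krivine_bound = 1.0 / c                     # pi / (2 log(1 + sqrt 2)), about 1.7822
rng = random.Random(0)
h_estimate = estimate_H(0.5, 200_000, rng)
h_exact = (2.0 / math.pi) * math.asin(0.5)  # Grothendieck's identity gives 1/3
```

The factor $1/\sqrt2$ in (32) and (34) plays no role here because the sign function is invariant under positive scaling.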

Since the goal of the above discussion is to round vectors $\{x_i\}_{i=1}^m,\{y_j\}_{j=1}^n\subseteq S^{m+n-1}$ to
signs $\{\varepsilon_i\}_{i=1}^m,\{\delta_j\}_{j=1}^n\subseteq\{-1,1\}$, it seems natural to expect that the best possible Krivine
rounding scheme occurs when $k=1$ and $f(x)=g(x)=\mathrm{sign}(x)$. If true, this would imply
that $K_G=\pi/(2\log(1+\sqrt2))$; a long-standing conjecture of Krivine [77]. Over the years
additional evidence supporting Krivine's conjecture was discovered, and a natural analytic
conjecture was made in [76] as a step towards proving it. We will not discuss these topics
here since in [21] it was shown that actually $K_G\le\pi/(2\log(1+\sqrt2))-\varepsilon_0$ for some effective
constant $\varepsilon_0>0$.

It is known [21, Lem. 2.4] that among all one dimensional Krivine rounding schemes
$f,g:\mathbb{R}\to\{-1,1\}$ we indeed have $c(f,g)\le\frac{2}{\pi}\log(1+\sqrt2)$, i.e., it does not pay off to
take partitions of $\mathbb{R}$ which are more complicated than the half-line partitions. Somewhat
unexpectedly, it was shown in [21] that a certain two dimensional Krivine rounding scheme
$f,g:\mathbb{R}^2\to\{-1,1\}$ satisfies $c(f,g)>\frac{2}{\pi}\log(1+\sqrt2)$. The proof of [21] uses a Krivine
rounding scheme $f,g:\mathbb{R}^2\to\{-1,1\}$ when $f=g$ corresponds to the partition of $\mathbb{R}^2$ as the
sub-graph and super-graph of the polynomial $y=c(x^5-10x^3+15x)$, where $c>0$ is an
appropriately chosen constant. This partition is depicted in Figure 1.

As explained in [21, Sec. 3], there is a natural guess for the "best" two dimensional Krivine
rounding scheme based on a certain numerical computation which we will not discuss here.
For this (conjectural) scheme we have $f\neq g$, and the planar partition corresponding to $f$
is depicted in Figure 2. Of course, once Krivine's conjecture has been disproved and the
usefulness of higher dimensional rounding schemes has been established, there is no reason
to expect that the situation won't improve as we consider $k$-dimensional Krivine rounding
schemes for $k\ge3$. A positive solution to an analytic question presented in [21] might even
lead to an exact computation of $K_G$; see [21, Sec. 3] for the details.

Figure 1. The partition of $\mathbb{R}^2$ used in [21] to show that $K_G$ is smaller than Krivine's
bound; the shaded regions are separated by the graph $y=c(x^5-10x^3+15x)$.

Figure 2. The "tiger partition" restricted to the square $[-20,20]^2$. This is the conjectured [21]
optimal partition of $\mathbb{R}^2$ for the purpose of Krivine-type rounding.


3. The Grothendieck constant of a graph 

Fix $n\in\mathbb{N}$ and let $G=(\{1,\ldots,n\},E)$ be a graph on the vertices $\{1,\ldots,n\}$. We assume
throughout that $G$ does not contain any self loops, i.e., $E\subseteq\{S\subseteq\{1,\ldots,n\}:\ |S|=2\}$.
Following [7], define the Grothendieck constant of $G$, denoted $K(G)$, to be the smallest
constant $K\in(0,\infty)$ such that every $n\times n$ matrix $(a_{ij})$ satisfies

$$\max_{x_1,\ldots,x_n\in S^{n-1}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ \{i,j\}\in E}}a_{ij}\langle x_i,x_j\rangle\ \le\ K\max_{\varepsilon_1,\ldots,\varepsilon_n\in\{-1,1\}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ \{i,j\}\in E}}a_{ij}\varepsilon_i\varepsilon_j. \tag{36}$$

Inequality (36) is an extension of the Grothendieck inequality since (1) is the special case
of (36) when $G$ is a bipartite graph. Thus

$$K_G=\sup_{n\in\mathbb{N}}\bigl\{K(G):\ G\text{ is an }n\text{-vertex bipartite graph}\bigr\}. \tag{37}$$
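The opposite extreme of bipartite graphs is $G=K_n$, the $n$-vertex complete graph. In this
case (36) boils down to the following inequality

$$\max_{x_1,\ldots,x_n\in S^{n-1}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ i\neq j}}a_{ij}\langle x_i,x_j\rangle\ \le\ K(K_n)\max_{\varepsilon_1,\ldots,\varepsilon_n\in\{-1,1\}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ i\neq j}}a_{ij}\varepsilon_i\varepsilon_j. \tag{38}$$

It turns out that $K(K_n)\asymp\log n$. The estimate $K(K_n)\lesssim\log n$ was proved in [91, 101, 60, 27].
In fact, as shown in [7, Thm. 3.7], the following stronger inequality holds true for every $n\times n$
matrix $(a_{ij})$; it implies that $K(K_n)\lesssim\log n$ by the Cauchy-Schwarz inequality.

$$\max_{x_1,\ldots,x_n\in S^{n-1}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ i\neq j}}a_{ij}\langle x_i,x_j\rangle\ \lesssim\ \log\Biggl(\frac{\sum_{i\in\{1,\ldots,n\}}\sum_{j\in\{1,\ldots,n\}\setminus\{i\}}|a_{ij}|}{\sqrt{\sum_{i\in\{1,\ldots,n\}}\sum_{j\in\{1,\ldots,n\}\setminus\{i\}}a_{ij}^2}}\Biggr)\max_{\varepsilon_1,\ldots,\varepsilon_n\in\{-1,1\}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ i\neq j}}a_{ij}\varepsilon_i\varepsilon_j.$$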

The matching lower bound $K(K_n)\gtrsim\log n$ is due to [7], improving over a result of [60].
How can we interpolate between the two extremes (37) and (38)? The Grothendieck
constant $K(G)$ depends on the combinatorial structure of the graph $G$, but at present our
understanding of this dependence is incomplete. The following general bounds are known.

$$\log\omega\ \lesssim\ K(G)\ \lesssim\ \log\vartheta, \tag{39}$$

and

$$K(G)\le\frac{\pi}{2\log\Bigl(\frac{1+\sqrt{(\vartheta-1)^2+1}}{\vartheta-1}\Bigr)}, \tag{40}$$

where (39) is due to [7] and (40) is due to [23]. Here $\omega$ is the clique number of $G$, i.e.,
the largest $k\in\{2,\ldots,n\}$ such that there exists $S\subseteq\{1,\ldots,n\}$ of cardinality $k$ satisfying
$\{i,j\}\in E$ for all distinct $i,j\in S$, and

$$\vartheta=\min\Bigl\{\max_{i\in\{1,\ldots,n\}}\frac{1}{\langle x_i,y\rangle^2}:\ x_1,\ldots,x_n,y\in S^n\ \wedge\ \forall\{i,j\}\in E,\ \langle x_i,x_j\rangle=0\Bigr\}. \tag{41}$$

The parameter $\vartheta$ is known as the Lovász theta function of the complement of $G$; an
important graph parameter that was introduced in [87]. We refer to [59] and [7, Thm. 3.5]
for alternative characterizations of $\vartheta$. It suffices to say here that it was shown in [HT] that
$\vartheta\le\chi$, where $\chi$ is the chromatic number of $G$, i.e., the smallest integer $k$ such that there
exists a partition $\{A_1,\ldots,A_k\}$ of $\{1,\ldots,n\}$ such that $\{i,j\}\notin E$ for all $(i,j)\in\bigcup_{\ell=1}^kA_\ell\times A_\ell$.
Note that the upper bound in (39) is superior to (40) when $\vartheta$ is large, but when $\vartheta=2$ the
bound (40) implies Krivine's classical bound [77] $K_G\le\pi/(2\log(1+\sqrt2))$.

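The upper and lower bounds in (39) are known to match up to absolute constants for a
variety of graph classes. Several such sharp Grothendieck-type inequalities are presented in
Sections 5.2 and 5.3 of [7]. For example, as explained in [7], it follows from (39), combined
with combinatorial results of [HTJ [9], that for every $n\times n\times n$ 3-tensor $(a_{ijk})$ we have

$$\max_{\{x_{ij}\}_{i,j=1}^n\subseteq S^{n^2-1}}\sum_{\substack{i,j,k\in\{1,\ldots,n\}\\ i\neq j\neq k}}a_{ijk}\langle x_{ij},x_{jk}\rangle\ \lesssim\ \max_{\{\varepsilon_{ij}\}_{i,j=1}^n\subseteq\{-1,1\}}\sum_{\substack{i,j,k\in\{1,\ldots,n\}\\ i\neq j\neq k}}a_{ijk}\varepsilon_{ij}\varepsilon_{jk}.$$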



While (39) is often a satisfactory asymptotic evaluation of $K(G)$, this isn't always the
case. In particular, it is unknown whether $K(G)$ can be bounded from below by a function
of $\vartheta$ that tends to $\infty$ as $\vartheta\to\infty$. An instance in which (39) is not sharp is the case of
Erdős-Rényi [35] random graphs $G(n,1/2)$. For such graphs we have $\omega\asymp\log n$ almost
surely as $n\to\infty$; see [HD] and [IHl Sec. 4.5]. At the same time, for $G(n,1/2)$ we have [58]
$\vartheta\asymp\sqrt{n}$ almost surely as $n\to\infty$. Thus (39) becomes in this case the rather weak estimate
$\log\log n\lesssim K(G(n,1/2))\lesssim\log n$. It turns out [3] that $K(G(n,1/2))\asymp\log n$ almost surely as
$n\to\infty$; we refer to [3] for additional computations of this type of the Grothendieck constant
of random and pseudo-random graphs. An explicit evaluation of the Grothendieck constant
of certain graph families can be found in [7S]; for example, if $G$ is a graph of girth $g$ that is
not a forest and does not admit $K_5$ as a minor then $K(G)=\frac{g\cos(\pi/g)}{g-2}$.

3.1. Algorithmic consequences. Other than being a natural variant of the Grothendieck
inequality, and hence of intrinsic mathematical interest, (36) has ramifications to discrete
optimization problems, which we now describe.



3.1.1. Spin glasses. Perhaps the most natural interpretation of (36) is in the context of solid
state physics, specifically the problem of efficient computation of ground states of Ising spin
glasses. The graph $G$ represents the interaction pattern of $n$ particles; thus $\{i,j\}\notin E$ if and
only if the particles $i$ and $j$ cannot interact with each other. Let $a_{ij}$ be the magnitude of
the interaction of $i$ and $j$ (the sign of $a_{ij}$ corresponds to attraction/repulsion). In the Ising
model each particle $i\in\{1,\ldots,n\}$ has a spin $\varepsilon_i\in\{-1,1\}$ and the total energy of the system
is given by the quantity $-\sum_{\{i,j\}\in E}a_{ij}\varepsilon_i\varepsilon_j$. A spin configuration $(\varepsilon_1,\ldots,\varepsilon_n)\in\{-1,1\}^n$ is
called a ground state if it minimizes the total energy. Thus the problem of finding a ground
state is precisely that of computing the maximum appearing in the right hand side of (36).
For more information on this topic see [8H1 pp. 352-355].
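The correspondence between ground states and the maximum on the right hand side of (36) can be illustrated by exhaustive search on a tiny system. The Python sketch below uses a dictionary from unordered vertex pairs to couplings $a_{ij}$; this data layout, the function names and the "frustrated triangle" example are illustrative assumptions, and the $2^n$ enumeration is of course not the polynomial time method discussed in the text.

```python
from itertools import product

def energy(edges, spins):
    """Ising energy -sum over {i,j} in E of a_ij * eps_i * eps_j; `edges` maps (i, j) to a_ij."""
    return -sum(a * spins[i] * spins[j] for (i, j), a in edges.items())

def ground_state(n, edges):
    """A minimum energy configuration found by exhaustive search over {-1, 1}^n (tiny n only)."""
    return min(product((-1, 1), repeat=n), key=lambda s: energy(edges, s))

# A frustrated triangle: all three couplings are repulsive (a_ij = -1).
edges = {(0, 1): -1.0, (1, 2): -1.0, (0, 2): -1.0}
best = ground_state(3, edges)
best_energy = energy(edges, best)
# Right hand side of (36) for this instance (without the constant K).
max_quadratic_form = max(-energy(edges, s) for s in product((-1, 1), repeat=3))
```

In the triangle no configuration can make all three pairs of spins opposite, so exactly one bond is always "unsatisfied" in a ground state.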

Physical systems seek to settle at a ground state, and therefore it is natural to ask whether 
it is computationally efficient (i.e., polynomial time computable) to find such a ground state, 
at least approximately. Such questions have been studied in the physics literature for several 
decades; see [IHl [HI [131 122]. In particular, it was shown in [TB] that if $G$ is a planar graph
then one can find a ground state in polynomial time, but in [13] it was shown that when $G$
is the three dimensional grid then this computational task is NP-hard.



Since the quantity in the left hand side of (36) is a semidefinite program and therefore
can be computed in polynomial time with arbitrarily good precision, a good bound on
$K(G)$ yields a polynomial time algorithm that computes the energy of a ground state with
correspondingly good approximation guarantee. Moreover, as explained in [7], the proof of
the upper bound in (39) yields a polynomial time algorithm that finds a spin configuration
$(\sigma_1,\ldots,\sigma_n)\in\{-1,1\}^n$ for which

$$\sum_{\substack{i,j\in\{1,\ldots,n\}\\ \{i,j\}\in E}}a_{ij}\sigma_i\sigma_j\ \gtrsim\ \frac{1}{\log\vartheta}\cdot\max_{\varepsilon_1,\ldots,\varepsilon_n\in\{-1,1\}}\sum_{\substack{i,j\in\{1,\ldots,n\}\\ \{i,j\}\in E}}a_{ij}\varepsilon_i\varepsilon_j. \tag{42}$$

An analogous polynomial time algorithm corresponds to the bound (40). These algorithms
yield the best known efficient methods for computing a ground state of Ising spin glasses on
a variety of interaction graphs.



3.1.2. Correlation clustering. A different interpretation of (36) yields the best known polynomial
time approximation algorithm for the correlation clustering problem [11, 25]; this connection
is due to [27]. Interpret the graph $G=(\{1,\ldots,n\},E)$ as the "similarity/dissimilarity
graph" for the items $\{1,\ldots,n\}$, in the following sense. For $\{i,j\}\in E$ we are given a sign
$a_{ij}\in\{-1,1\}$ which has the following meaning: if $a_{ij}=1$ then $i$ and $j$ are deemed to be
similar, and if $a_{ij}=-1$ then $i$ and $j$ are deemed to be different. If $\{i,j\}\notin E$ then we do
not express any judgement on the similarity or dissimilarity of $i$ and $j$.
Assume that $A_1,\ldots,A_k$ is a partition (or "clustering") of $\{1,\ldots,n\}$. An agreement between
this clustering and our similarity/dissimilarity judgements is a pair $i,j\in\{1,\ldots,n\}$
such that $a_{ij}=1$ and $i,j\in A_r$ for some $r\in\{1,\ldots,k\}$, or $a_{ij}=-1$ and $i\in A_r$, $j\in A_s$
for distinct $r,s\in\{1,\ldots,k\}$. A disagreement between this clustering and our similarity/dissimilarity
judgements is a pair $i,j\in\{1,\ldots,n\}$ such that $a_{ij}=1$ and $i\in A_r$, $j\in A_s$
for distinct $r,s\in\{1,\ldots,k\}$, or $a_{ij}=-1$ and $i,j\in A_r$ for some $r\in\{1,\ldots,k\}$. Our goal is to
cluster the items while encouraging agreements and penalizing disagreements. Thus, we wish
to find a clustering of $\{1,\ldots,n\}$ into an unspecified number of clusters which maximizes the
total number of agreements minus the total number of disagreements.
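The objective "agreements minus disagreements" is straightforward to evaluate, and for a clustering into two parts encoded by signs $\varepsilon_i\in\{-1,1\}$ it coincides with $\sum_{\{i,j\}\in E}a_{ij}\varepsilon_i\varepsilon_j$. The Python sketch below illustrates both points on a small hypothetical instance; the dictionary representation of the signed graph and the function names are our own conventions.

```python
def clustering_score(signs, cluster_of):
    """Agreements minus disagreements.

    `signs` maps an edge (i, j) of the similarity graph to +1 (similar) or -1 (different);
    `cluster_of[i]` is the label of the cluster containing item i.
    """
    score = 0
    for (i, j), a in signs.items():
        same = cluster_of[i] == cluster_of[j]
        agrees = (a == 1 and same) or (a == -1 and not same)
        score += 1 if agrees else -1
    return score

def bipartition_form(signs, eps):
    """sum over edges of a_ij * eps_i * eps_j for a sign vector eps in {-1, 1}^n."""
    return sum(a * eps[i] * eps[j] for (i, j), a in signs.items())

signs = {(0, 1): 1, (1, 2): 1, (0, 2): -1, (2, 3): -1}
score3 = clustering_score(signs, [0, 0, 1, 2])            # a clustering into three parts
eps = [1, 1, -1, 1]                                       # a clustering into two parts
score2 = clustering_score(signs, [0 if e == 1 else 1 for e in eps])
```

For a bipartition, $a_{ij}\varepsilon_i\varepsilon_j=1$ exactly when the pair is an agreement and $-1$ exactly when it is a disagreement, which is why the two quantities agree.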

It was proved in [27] that the case of clustering into two parts is the "bottleneck" for this
problem: if there were a polynomial time algorithm that finds a clustering into two parts
for which the total number of agreements minus the total number of disagreements is at
least a fraction $\alpha\in(0,1)$ of the maximum possible (over all bi-partitions) total number of
agreements minus the total number of disagreements, then one could find in polynomial time
a clustering which is at least a fraction $\alpha/(2+\alpha)$ of the analogous maximum that is defined
without specifying the number of clusters.

One checks that the problem of finding a partition into two clusters that maximizes the
total number of agreements minus the total number of disagreements is the same as the
problem of computing the maximum in the right hand side of (36). Thus the upper bound
in (39) yields a polynomial time algorithm for correlation clustering with approximation
guarantee $O(\log\vartheta)$, which is the best known approximation algorithm for this problem.
Note that when $G$ is the complete graph then the approximation ratio is $O(\log n)$. As
will be explained in Section 7, it is known [69] that for every $\gamma\in(0,1/6)$, if there were a
polynomial time algorithm for correlation clustering that yields an approximation guarantee
of $(\log n)^\gamma$ then there would be an algorithm for 3-colorability that runs in time
$2^{(\log n)^{O(1)}}$, a conclusion which is widely believed to be impossible.



4. Kernel clustering and the propeller conjecture 

Here we describe a large class of Grothendieck-type inequalities that is motivated by
algorithmic applications to a combinatorial optimization problem called Kernel Clustering.
This problem originates in machine learning [110], and its only known rigorous approximation
algorithms follow from Grothendieck inequalities (these algorithms are sharp assuming the
UGC). We will first describe the inequalities and then the algorithmic application.

Consider the special case of the Grothendieck inequality (1) where $A=(a_{ij})$ is an $n\times n$
positive semidefinite matrix. In this case we may assume without loss of generality that
in (1) $x_i=y_i$ and $\varepsilon_i=\delta_i$ for every $i\in\{1,\ldots,n\}$ since this holds for the maxima on either
side of (1) (see also the explanation in P Sec. 5.2]). It follows from [45, 107] (see also [95])
that for every $n\times n$ symmetric positive semidefinite matrix $A=(a_{ij})$ we have

$$\max_{x_1,\ldots,x_n\in S^{n-1}}\sum_{i=1}^n\sum_{j=1}^na_{ij}\langle x_i,x_j\rangle\ \le\ \frac{\pi}{2}\max_{\varepsilon_1,\ldots,\varepsilon_n\in\{-1,1\}}\sum_{i=1}^n\sum_{j=1}^na_{ij}\varepsilon_i\varepsilon_j, \tag{43}$$

and that $\frac{\pi}{2}$ is the best possible constant in (43).



A natural variant of (43) is to replace the numbers $-1,1$ by general vectors $v_1,\ldots,v_k\in\mathbb{R}^k$,
namely one might ask for the smallest constant $K\in(0,\infty)$ such that for every symmetric
positive semidefinite $n\times n$ matrix $(a_{ij})$ we have:

$$\max_{x_1,\ldots,x_n\in S^{n-1}}\sum_{i=1}^n\sum_{j=1}^na_{ij}\langle x_i,x_j\rangle\ \le\ K\max_{u_1,\ldots,u_n\in\{v_1,\ldots,v_k\}}\sum_{i=1}^n\sum_{j=1}^na_{ij}\langle u_i,u_j\rangle. \tag{44}$$

The best constant $K$ in (44) can be characterized as follows. Let $B=(b_{ij}=\langle v_i,v_j\rangle)$ be the
Gram matrix of $v_1,\ldots,v_k$. Let $C(B)$ be the maximum over all partitions $\{A_1,\ldots,A_k\}$ of
$\mathbb{R}^{k-1}$ into measurable sets of the quantity $\sum_{i=1}^k\sum_{j=1}^kb_{ij}\langle z_i,z_j\rangle$, where for $i\in\{1,\ldots,k\}$ the
vector $z_i\in\mathbb{R}^{k-1}$ is the Gaussian moment of $A_i$, i.e.,

$$z_i=\frac{1}{(2\pi)^{(k-1)/2}}\int_{A_i}x\,e^{-\|x\|_2^2/2}\,dx.$$

It was proved in [67] that (44) holds with $K=1/C(B)$ and that this constant is sharp.

Inequality (44) with $K=1/C(B)$ is proved via the following rounding procedure. Fix unit
vectors $x_1,\ldots,x_n\in S^{n-1}$. Let $G=(g_{ij})$ be a $(k-1)\times n$ random matrix whose entries
are i.i.d. standard Gaussian random variables. Let $A_1,\ldots,A_k\subseteq\mathbb{R}^{k-1}$ be a measurable
partition of $\mathbb{R}^{k-1}$ at which $C(B)$ is attained (for a proof that the maximum defining $C(B)$ is
indeed attained, see [67]). Define a random choice of $u_i\in\{v_1,\ldots,v_k\}$ by setting $u_i=v_\ell$ for
the unique $\ell\in\{1,\ldots,k\}$ such that $Gx_i\in A_\ell$. The fact that (44) holds with $K=1/C(B)$ is
a consequence of the following fact, whose proof we skip (the full details are in [67]).

$$\mathbb{E}\Bigl[\sum_{i=1}^n\sum_{j=1}^na_{ij}\langle u_i,u_j\rangle\Bigr]\ \ge\ C(B)\sum_{i=1}^n\sum_{j=1}^na_{ij}\langle x_i,x_j\rangle. \tag{45}$$



Determining the partition of $\mathbb{R}^{k-1}$ that achieves the value $C(B)$ is a nontrivial problem in
general, even in the special case when $B=I_k$ is the $k\times k$ identity matrix. Note that in this
case one desires a partition $\{A_1,\ldots,A_k\}$ of $\mathbb{R}^{k-1}$ into measurable sets so as to maximize the
following quantity.

$$\sum_{i=1}^k\Bigl\|\frac{1}{(2\pi)^{(k-1)/2}}\int_{A_i}x\,e^{-\|x\|_2^2/2}\,dx\Bigr\|_2^2.$$

As shown in [OniEZ!, the optimal partition is given by simplicial cones centered at the origin.
When $B=I_2$ we have $C(I_2)=\frac{1}{\pi}$ and the optimal partition of $\mathbb{R}$ into two cones is the
positive and the negative axes. When $B=I_3$ it was shown in [66] that $C(I_3)=\frac{9}{8\pi}$, and the
optimal partition of $\mathbb{R}^2$ into three cones is the propeller partition, i.e., into three cones with
angular measure 120° each.
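For the propeller partition of the plane, the quantity $\sum_{i=1}^3\|z_i\|_2^2$ can be computed in polar coordinates: each cone has Gaussian moment of length $\sqrt{\pi/2}\cdot\sqrt3/(2\pi)$, so the sum is $3\cdot\frac{3}{8\pi}=\frac{9}{8\pi}\approx0.358$. The Python sketch below compares this value with a Monte Carlo estimate; the sampling scheme, the sample size and the function names are illustrative assumptions, and the estimate only confirms the value attained by the propeller, not its optimality.

```python
import math
import random

def propeller_part(x, y):
    """Index in {0, 1, 2} of the 120-degree cone containing the point (x, y)."""
    angle = math.atan2(y, x) % (2.0 * math.pi)
    return min(int(angle / (2.0 * math.pi / 3.0)), 2)

def estimate_sum_of_squared_moments(part_of, num_parts, samples, rng):
    """Monte Carlo estimate of sum_i ||z_i||^2, where z_i = E[g * 1{g in A_i}], g standard Gaussian in R^2."""
    sums = [[0.0, 0.0] for _ in range(num_parts)]
    for _ in range(samples):
        x, y = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        i = part_of(x, y)
        sums[i][0] += x
        sums[i][1] += y
    return sum((sx / samples) ** 2 + (sy / samples) ** 2 for sx, sy in sums)

rng = random.Random(0)
estimate = estimate_sum_of_squared_moments(propeller_part, 3, 200_000, rng)
exact = 9.0 / (8.0 * math.pi)
```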

Though it might be surprising at first sight, the authors posed in [66] the propeller conjecture:
for any $k\ge4$, the optimal partition of $\mathbb{R}^{k-1}$ into $k$ parts is $\mathcal{P}\times\mathbb{R}^{k-3}$ where $\mathcal{P}$ is the
propeller partition of $\mathbb{R}^2$. In other words, even if one is allowed to use $k$ parts, the propeller
conjecture asserts that the best partition consists of only three nonempty parts. Recently,
this conjecture was solved positively [53] for $k=4$, i.e., for partitions of $\mathbb{R}^3$ into four measurable
parts. The proof of [53] reduces the problem to a concrete finite set of numerical
inequalities which are then verified with full rigor in a computer-assisted fashion. Note that
this is the first nontrivial (surprising?) case of the propeller conjecture, i.e., this is the first
case in which we indeed drop one of the four allowed parts in the optimal partition.






We now describe an application of (44) to the Kernel Clustering problem; a general framework
for clustering massive statistical data so as to uncover a certain hypothesized structure [110].
The problem is defined as follows. Let $A=(a_{ij})$ be an $n\times n$ symmetric positive
semidefinite matrix which is usually normalized to be centered, i.e., $\sum_{i=1}^n\sum_{j=1}^na_{ij}=0$. The
matrix $A$ is often thought of as the correlation matrix of random variables $(X_1,\ldots,X_n)$ that
measure attributes of certain empirical data, i.e., $a_{ij}=\mathbb{E}[X_iX_j]$. We are also given another
symmetric positive semidefinite $k\times k$ matrix $B=(b_{ij})$ which functions as a hypothesis, or
test matrix. Think of $n$ as huge and $k$ as a small constant. The goal is to cluster $A$ so as
to obtain a smaller matrix which most resembles $B$. Formally, we wish to find a partition
$\{S_1,\ldots,S_k\}$ of $\{1,\ldots,n\}$ so that if we write $c_{ij}=\sum_{(p,q)\in S_i\times S_j}a_{pq}$ then the resulting clustered
version of $A$ has the maximum correlation $\sum_{i=1}^k\sum_{j=1}^kc_{ij}b_{ij}$ with the hypothesis matrix
$B$. In words, we form a $k\times k$ matrix $C=(c_{ij})$ by summing the entries of $A$ over the blocks
induced by the given partition, and we wish to produce in this way a matrix that is most
correlated with $B$. Equivalently, the goal is to evaluate the number:

$$\mathrm{Clust}(A|B)=\max_{\sigma:\{1,\ldots,n\}\to\{1,\ldots,k\}}\sum_{i=1}^n\sum_{j=1}^na_{ij}b_{\sigma(i)\sigma(j)}. \tag{46}$$
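The definition of $\mathrm{Clust}(A|B)$ can be made concrete by exhaustive search over all $k^n$ assignments $\sigma$, which is feasible only for tiny inputs. The Python sketch below also forms the clustered matrix $C=(c_{ij})$ so that the two descriptions of the objective can be compared; the function names and the example (a centered Gram matrix of four points on the line, with $B$ the $2\times2$ identity) are hypothetical illustrations.

```python
from itertools import product

def clustered_matrix(A, sigma, k):
    """C[r][s] = sum of A[p][q] over p in cluster r and q in cluster s."""
    C = [[0.0] * k for _ in range(k)]
    for p, row in enumerate(A):
        for q, a in enumerate(row):
            C[sigma[p]][sigma[q]] += a
    return C

def correlation(A, B, sigma):
    """sum_{i,j} A[i][j] * B[sigma[i]][sigma[j]], the objective defining Clust(A|B)."""
    n = len(A)
    return sum(A[i][j] * B[sigma[i]][sigma[j]] for i in range(n) for j in range(n))

def clust_brute_force(A, B):
    """Exhaustive maximum over all k^n assignments sigma (tiny n only)."""
    n, k = len(A), len(B)
    return max(correlation(A, B, sigma) for sigma in product(range(k), repeat=n))

# A = Gram matrix of the centered points -2, -1, 1, 2 on the line; B = 2 x 2 identity.
points = [-2.0, -1.0, 1.0, 2.0]
A = [[x * y for y in points] for x in points]
B = [[1.0, 0.0], [0.0, 1.0]]
value = clust_brute_force(A, B)
C = clustered_matrix(A, (0, 0, 1, 1), 2)
```

With $B$ the identity the objective is the sum over clusters of the squared cluster sums, so the best assignment here separates the negative points from the positive ones.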

The strength of this generic clustering framework is based in part on the flexibility of 
adapting the matrix B to the problem at hand. Various particular choices of B lead to well 
studied optimization problems, while other specialized choices of B are based on statistical 
hypotheses which have been applied with some empirical success. We refer to [110, 66] for
additional background and a discussion of specific examples.

In [66] it was shown that there exists a randomized polynomial time algorithm that takes
as input two positive semidefinite matrices $A,B$ and outputs a number $\alpha$ that satisfies
$\mathrm{Clust}(A|B)\le\mathbb{E}[\alpha]\le\bigl(1+\frac{3\pi}{2}\bigr)\mathrm{Clust}(A|B)$. There is no reason to believe that the approximation
factor of $1+\frac{3\pi}{2}$ is sharp, but nevertheless prior to this result, which is based on (44),
no constant factor polynomial time approximation algorithm for this problem was known.

Sharper results can be obtained if we assume that the input matrices are normalized
appropriately. Specifically, assume that $k\ge3$ and restrict only to inputs $A$ that are
centered, i.e., $\sum_{i=1}^n\sum_{j=1}^na_{ij}=0$, and inputs $B$ that are either the identity matrix $I_k$,
or satisfy $\sum_{i=1}^k\sum_{j=1}^kb_{ij}=0$ ($B$ is centered as well) and $b_{ii}=1$ for all $i\in\{1,\ldots,k\}$
($B$ is "spherical"). Under these assumptions the output of the algorithm of [66] satisfies
$\mathrm{Clust}(A|B)\le\mathbb{E}[\alpha]\le\frac{8\pi}{9}\bigl(1-\frac1k\bigr)\mathrm{Clust}(A|B)$. Moreover, it was shown in [66] that assuming
the propeller conjecture and the UGC, no polynomial time algorithm can achieve an
approximation guarantee that is strictly smaller than $\frac{8\pi}{9}\bigl(1-\frac1k\bigr)$ (for input matrices normalized
as above). Since the propeller conjecture is known to hold true for $k=3$ [66] and $k=4$
[53], we know that the UGC hardness threshold for the above problem is exactly $\frac{16\pi}{27}$ when
$k=3$ and $\frac{2\pi}{3}$ when $k=4$.

A finer, and perhaps more natural, analysis of the kernel clustering problem can be obtained
if we fix the matrix $B$ and let the input be only the matrix $A$, with the goal being, as
before, to approximate the quantity $\mathrm{Clust}(A|B)$ in polynomial time. Since $B$ is symmetric
and positive semidefinite we can find vectors $v_1,\ldots,v_k\in\mathbb{R}^k$ such that $B$ is their Gram
matrix, i.e., $b_{ij}=\langle v_i,v_j\rangle$ for all $i,j\in\{1,\ldots,k\}$. Let $R(B)$ be the smallest possible radius
of a Euclidean ball in $\mathbb{R}^k$ which contains $\{v_1,\ldots,v_k\}$ and let $w(B)$ be the center of this ball.
We note that both $R(B)$ and $w(B)$ can be efficiently computed by solving an appropriate
semidefinite program. Let $C(B)$ be the parameter defined above.

It is shown in [67] that for every fixed symmetric positive semidefinite $k \times k$ matrix $B$ there exists a randomized polynomial time algorithm which, given an $n \times n$ symmetric positive semidefinite centered matrix $A$, outputs a number $\mathrm{Alg}(A)$ such that

$$\mathrm{Clust}(A|B) \le \mathbb{E}[\mathrm{Alg}(A)] \le \frac{R(B)^2}{C(B)}\, \mathrm{Clust}(A|B).$$

As we will explain in Section 7, assuming the UGC no polynomial time algorithm can achieve an approximation guarantee strictly smaller than $R(B)^2/C(B)$.

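The algorithm of [67] uses semidefinite programming to compute the value

$$\mathrm{SDP}(A|B) = \max\left\{ \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle x_i, x_j \rangle : x_1, \ldots, x_n \in \mathbb{R}^n \ \wedge\ \|x_i\|_2 \le 1 \ \forall i \in \{1, \ldots, n\} \right\} = \max\left\{ \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle x_i, x_j \rangle : x_1, \ldots, x_n \in S^{n-1} \right\}, \tag{47}$$

where the last equality in (47) holds since the function $(x_1, \ldots, x_n) \mapsto \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle x_i, x_j \rangle$ is convex (by virtue of the fact that $A$ is positive semidefinite). We claim that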

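$$\frac{\mathrm{Clust}(A|B)}{R(B)^2} \le \mathrm{SDP}(A|B) \le \frac{\mathrm{Clust}(A|B)}{C(B)}, \tag{48}$$

which implies that if we output the number $R(B)^2\, \mathrm{SDP}(A|B)$ we will obtain a polynomial time algorithm which approximates $\mathrm{Clust}(A|B)$ up to a factor of $\frac{R(B)^2}{C(B)}$. To verify (48) let $x_1^*, \ldots, x_n^* \in S^{n-1}$ and $\sigma^* : \{1, \ldots, n\} \to \{1, \ldots, k\}$ be such that

$$\mathrm{SDP}(A|B) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle x_i^*, x_j^* \rangle \quad \text{and} \quad \mathrm{Clust}(A|B) = \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle v_{\sigma^*(i)}, v_{\sigma^*(j)} \rangle.$$

Write $(a_{ij})_{i,j=1}^n = (\langle u_i, u_j \rangle)_{i,j=1}^n$ for some $u_1, \ldots, u_n \in \mathbb{R}^n$. The assumption that $A$ is centered means that $\sum_{i=1}^n u_i = 0$, and consequently $\sum_{j=1}^n a_{ij} = \langle u_i, \sum_{j=1}^n u_j \rangle = 0$ for every $i \in \{1, \ldots, n\}$. The rightmost inequality in (48) is just the Grothendieck inequality (44). The leftmost inequality in (48) follows from the fact that $\frac{v_{\sigma^*(i)} - w(B)}{R(B)}$ has norm at most 1 for all $i \in \{1, \ldots, n\}$. Indeed, these norm bounds imply that

$$\mathrm{SDP}(A|B) \ge \sum_{i=1}^n \sum_{j=1}^n a_{ij} \left\langle \frac{v_{\sigma^*(i)} - w(B)}{R(B)}, \frac{v_{\sigma^*(j)} - w(B)}{R(B)} \right\rangle = \frac{1}{R(B)^2} \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle v_{\sigma^*(i)}, v_{\sigma^*(j)} \rangle = \frac{\mathrm{Clust}(A|B)}{R(B)^2},$$

where the middle equality holds because all the terms involving $w(B)$ vanish, the row sums and column sums of $A$ being zero.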
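The leftmost inequality in (48) can be checked by hand in the special case $B = I_k$. The following is a minimal sketch under stated assumptions, not the algorithm of [67]: it takes $v_i = e_i$, uses the elementary fact that the smallest ball containing $e_1, \ldots, e_k$ is centered at $w = (1/k, \ldots, 1/k)$ with $R^2 = 1 - 1/k$, and computes $\mathrm{Clust}(A|I_k)$ by brute force. The function names and the small input matrix are hypothetical and chosen only for illustration.

```python
from itertools import product

def clust_identity(A, k):
    """Brute-force Clust(A|I_k): maximise the sum of a_ij over pairs with sigma(i) == sigma(j)."""
    n = len(A)
    best_val, best_sigma = float("-inf"), None
    for sigma in product(range(k), repeat=n):
        val = sum(A[i][j] for i in range(n) for j in range(n) if sigma[i] == sigma[j])
        if val > best_val:
            best_val, best_sigma = val, sigma
    return best_val, best_sigma

def feasible_value(A, sigma, k):
    """Value of sum a_ij <x_i, x_j> at the feasible point x_i = (e_{sigma(i)} - w) / R,
    where w = (1/k, ..., 1/k) and R^2 = 1 - 1/k (enclosing ball of e_1, ..., e_k)."""
    n = len(A)
    radius = (1.0 - 1.0 / k) ** 0.5
    xs = [[((1.0 if c == sigma[i] else 0.0) - 1.0 / k) / radius for c in range(k)]
          for i in range(n)]
    return sum(A[i][j] * sum(p * q for p, q in zip(xs[i], xs[j]))
               for i in range(n) for j in range(n))

# Hypothetical centered PSD input: the Gram matrix of vectors u_i that sum to zero.
U = [(2.0, 0.0), (-1.0, 1.0), (-1.0, -1.0), (0.0, 0.0)]
A = [[sum(p * q for p, q in zip(u, v)) for v in U] for u in U]
K = 3

clust, sigma_star = clust_identity(A, K)
lower = feasible_value(A, sigma_star, K)
print(clust, lower, clust / (1.0 - 1.0 / K))
```

Because the rows of a centered positive semidefinite matrix sum to zero, the value at this feasible point equals $\mathrm{Clust}(A|I_k)/R^2$ exactly, which is the quantity that bounds the semidefinite program from below.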
This completes the proof that the above algorithm efficiently approximates the number $\mathrm{Clust}(A|B)$, but does not address the issue of how to efficiently compute an assignment $\sigma : \{1, \ldots, n\} \to \{1, \ldots, k\}$ for which the induced clustering of $A$ has the required value. The issue here is to find efficiently a conical simplicial partition $A_1, \ldots, A_k$ of $\mathbb{R}^{k-1}$ at which $C(B)$ is attained. Such a partition exists and may be assumed to be hardwired into the description of the algorithm. Alternatively, the partition that achieves $C(B)$ up to a desired degree of accuracy can be found by brute force for fixed $k$ (or $k = k(n)$ growing sufficiently slowly as a function of $n$); see [67]. For large values of $k$ the problem of computing $C(B)$ efficiently remains open.



5. The Lp Grothendieck problem 

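Fix $p \in [1, \infty]$ and consider the following algorithmic problem. The input is an $n \times n$ matrix $A = (a_{ij})$ whose diagonal entries vanish, and the goal is to compute (or estimate) in polynomial time the quantity

$$M_p(A) = \max_{\substack{t_1, \ldots, t_n \in \mathbb{R} \\ \sum_{k=1}^n |t_k|^p \le 1}} \sum_{i=1}^n \sum_{j=1}^n a_{ij} t_i t_j = \max_{\substack{t_1, \ldots, t_n \in \mathbb{R} \\ \sum_{k=1}^n |t_k|^p = 1}} \sum_{i=1}^n \sum_{j=1}^n a_{ij} t_i t_j. \tag{49}$$
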

The second equality in (49) follows from a straightforward convexity argument since the diagonal entries of $A$ vanish. Some of the results described below hold true without the vanishing diagonal assumption, but we will tacitly make this assumption here since the second equality in (49) makes the problem purely combinatorial when $p = \infty$. Specifically, if $G = (\{1, \ldots, n\}, E)$ is the complete graph then $M_\infty(A) = \max_{\varepsilon_1, \ldots, \varepsilon_n \in \{-1, 1\}} \sum_{\{i,j\} \in E} a_{ij} \varepsilon_i \varepsilon_j$. The results described in Section 3 therefore imply that there is a polynomial time algorithm that approximates $M_\infty(A)$ up to a $O(\log n)$ factor, and that it is computationally hard to achieve an approximation guarantee smaller than $(\log n)^\gamma$ for all $\gamma \in (0, 1/6)$.
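To make the combinatorial nature of the case $p = \infty$ concrete, here is a small brute-force sketch that evaluates the double sum in (49) over all sign vectors. It runs in time exponential in $n$ and is meant only as an illustration for tiny inputs; the function name and the example matrix are hypothetical.

```python
from itertools import product

def m_infinity(A):
    """Brute-force M_infinity(A): maximise sum_{i,j} a_ij * eps_i * eps_j over eps in {-1, 1}^n.
    Assumes the diagonal of A vanishes, so the maximum over the cube is attained at a vertex."""
    n = len(A)
    return max(
        sum(A[i][j] * eps[i] * eps[j] for i in range(n) for j in range(n))
        for eps in product((-1, 1), repeat=n)
    )

# Hypothetical symmetric input with zero diagonal ("frustrated triangle"):
# the three sign constraints cannot all be satisfied simultaneously.
A = [[0, 1, -1],
     [1, 0, 1],
     [-1, 1, 0]]
print(m_infinity(A))
```

In this example at most two of the three pairwise sign preferences can be met at once, which is why the maximum falls short of the sum of the absolute values of the entries.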

There are values of $p$ for which the above problem can be solved in polynomial time. When $p = 2$ the quantity $M_2(A)$ is the largest eigenvalue of $A$, and hence can be computed in polynomial time [^[82] . When $p = 1$ it was shown in [2] that it is possible to approximate $M_1(A)$ up to a factor of $1 + \varepsilon$ in time $n^{O(1/\varepsilon)}$. It is also shown in [2] that the problem of $(1 + \varepsilon)$-approximately computing $M_1(A)$ is W[1] complete; we refer to [SS] for the definition of this type of hardness result and just say here that it indicates that a running time of $c(\varepsilon) n^{O(1)}$ is impossible.

The algorithm of [2] proceeds by showing that for every $m \in \mathbb{N}$ there exist $y_1, \ldots, y_n \in \frac{1}{m}\mathbb{Z}$ with $\sum_{i=1}^n |y_i| \le 1$ and $\sum_{i=1}^n \sum_{j=1}^n a_{ij} y_i y_j \ge \left(1 - \frac{1}{m}\right) M_1(A)$. The number of such vectors $y$ is $1 + \sum_{k=1}^m \sum_{\ell=1}^n 2^\ell \binom{n}{\ell} \binom{k-1}{\ell-1} \le 4n^m$. An exhaustive search over all such vectors will then approximate $M_1(A)$ to within a factor of $m/(m-1)$ in time $O(n^m)$. To prove the existence of $y$ fix $t_1, \ldots, t_n \in \mathbb{R}$ with $\sum_{k=1}^n |t_k| = 1$ and $\sum_{i=1}^n \sum_{j=1}^n a_{ij} t_i t_j = M_1(A)$. Let $X \in \mathbb{R}^n$ be a random vector given by $\Pr[X = \mathrm{sign}(t_j) e_j] = |t_j|$ for every $j \in \{1, \ldots, n\}$. Here $e_1, \ldots, e_n$ is the standard basis of $\mathbb{R}^n$. Let $\{X_s = (X_{s1}, \ldots, X_{sn})\}_{s=1}^m$ be independent copies of $X$ and set $Y = (Y_1, \ldots, Y_n) = \frac{1}{m} \sum_{s=1}^m X_s$. Note that if $s, t \in \{1, \ldots, m\}$ are distinct then for all $i, j \in \{1, \ldots, n\}$ we have $\mathbb{E}[X_{si} X_{tj}] = \mathrm{sign}(t_i)\mathrm{sign}(t_j) |t_i| \cdot |t_j| = t_i t_j$. Also, for every $s \in \{1, \ldots, m\}$ and every distinct $i, j \in \{1, \ldots, n\}$ we have $X_{si} X_{sj} = 0$. Since the diagonal entries of $A$ vanish it follows that

$$\mathbb{E}\left[ \sum_{i=1}^n \sum_{j=1}^n a_{ij} Y_i Y_j \right] = \frac{1}{m^2} \sum_{\substack{s, t \in \{1, \ldots, m\} \\ s \ne t}} \ \sum_{\substack{i, j \in \{1, \ldots, n\} \\ i \ne j}} a_{ij} \mathbb{E}[X_{si} X_{tj}] = \left(1 - \frac{1}{m}\right) M_1(A). \tag{50}$$

Noting that the vector $Y$ has $\ell_1$ norm at most 1 and all of its entries are integer multiples of $1/m$, it follows from (50) that with positive probability $Y$ will have the desired properties.
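The exhaustive search described above can be sketched as follows. This is an illustrative toy version, assuming small $n$ and $m$: it enumerates the grid by filtering $\{-m, \ldots, m\}^n$, which is simpler but slower than an enumeration achieving the $O(n^m)$ bound, and it uses exact rational arithmetic. The function names are hypothetical. The counting function evaluates the double sum for the number of grid points with $\ell_1$ norm at most 1, so that it can be compared with a direct count.

```python
from fractions import Fraction
from itertools import product
from math import comb

def grid_points(n, m):
    """Yield integer vectors z in Z^n with sum |z_i| <= m, so that y = z/m has l1 norm <= 1."""
    for z in product(range(-m, m + 1), repeat=n):
        if sum(abs(c) for c in z) <= m:
            yield z

def count_formula(n, m):
    """1 + sum_{k=1..m} sum_{l=1..n} 2^l * C(n, l) * C(k-1, l-1): the number of grid points."""
    return 1 + sum(2 ** l * comb(n, l) * comb(k - 1, l - 1)
                   for k in range(1, m + 1) for l in range(1, n + 1))

def m1_grid_search(A, m):
    """Best value of sum a_ij y_i y_j over y in (1/m)Z^n with l1 norm <= 1.
    By the argument in the text this is at least (1 - 1/m) * M_1(A)."""
    n = len(A)
    best = Fraction(0)
    for z in grid_points(n, m):
        val = Fraction(sum(A[i][j] * z[i] * z[j] for i in range(n) for j in range(n)), m * m)
        best = max(best, val)
    return best

# Hypothetical example: for A = [[0, 1], [1, 0]] one has M_1(A) = 1/2, attained at t = (1/2, 1/2).
A = [[0, 1], [1, 0]]
print(m1_grid_search(A, 2), m1_grid_search(A, 3), count_formula(2, 2))
```

For $m = 2$ the grid contains the maximizer itself, while for $m = 3$ the search returns $4/9$, which is consistent with the guaranteed lower bound $(1 - 1/3) \cdot \frac{1}{2} = \frac{1}{3}$.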

How can we interpolate between the above results for $p \in \{1, 2, \infty\}$? It turns out that there is a satisfactory answer for $p \in (2, \infty)$ but the range $p \in (1, 2)$ remains a mystery. To explain this write $\gamma_p = (\mathbb{E}[|G|^p])^{1/p}$, where $G$ is a standard Gaussian random variable. One computes that

$$\gamma_p = \sqrt{2} \left( \frac{\Gamma\left(\frac{p+1}{2}\right)}{\sqrt{\pi}} \right)^{1/p}. \tag{51}$$

Also, Stirling's formula implies that $\gamma_p^2 = \frac{p}{e} + O(1)$ as $p \to \infty$. It is known that for every fixed $p \in [2, \infty)$ there exists a polynomial time algorithm that approximates $M_p(A)$ to within a factor of $\gamma_p^2$, and that for every $\varepsilon \in (0, 1)$ the existence of a polynomial time algorithm that approximates $M_p(A)$ to within a factor $\gamma_p^2 - \varepsilon$ would imply that P = NP. These results improve over the earlier work [70] which designed a polynomial time algorithm for $M_p(A)$ whose approximation guarantee is $(1 + o(1))\gamma_p^2$ as $p \to \infty$, and which proved a $\gamma_p^2 - \varepsilon$ hardness result assuming the UGC rather than P $\ne$ NP.
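The Gaussian moment constant can be evaluated numerically. The sketch below assumes the standard closed form $\mathbb{E}|G|^p = 2^{p/2}\, \Gamma\left(\frac{p+1}{2}\right)/\sqrt{\pi}$ for a standard Gaussian $G$ and computes $\gamma_p^2 = (\mathbb{E}|G|^p)^{2/p}$; the function name is hypothetical.

```python
import math

def gamma_p_squared(p):
    """gamma_p^2 = (E|G|^p)^(2/p) for a standard Gaussian G,
    using E|G|^p = 2^(p/2) * Gamma((p + 1) / 2) / sqrt(pi)."""
    moment = 2.0 ** (p / 2.0) * math.gamma((p + 1.0) / 2.0) / math.sqrt(math.pi)
    return moment ** (2.0 / p)

for p in (2, 3, 10, 200):
    # Print p, gamma_p^2 and the leading Stirling term p/e for comparison.
    print(p, gamma_p_squared(p), p / math.e)
```

For $p = 2$ the value is 1, reflecting $\mathbb{E}[G^2] = 1$, and for $p = 3$ it equals $2/\sqrt[3]{\pi} \approx 1.366$. For large $p$ the difference from $p/e$ stays bounded, in line with the Stirling estimate.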

The following Grothendieck-type inequality was proved in [92], and independently elsewhere. For every $n \times n$ matrix $A = (a_{ij})$ and every $p \in [2, \infty)$ we have

$$\max_{\substack{x_1, \ldots, x_n \in \mathbb{R}^n \\ \sum_{k=1}^n \|x_k\|_2^p \le 1}} \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle x_i, x_j \rangle \le \gamma_p^2 \max_{\substack{t_1, \ldots, t_n \in \mathbb{R} \\ \sum_{k=1}^n |t_k|^p \le 1}} \sum_{i=1}^n \sum_{j=1}^n a_{ij} t_i t_j. \tag{52}$$

The constant $\gamma_p^2$ in (52) is sharp. The validity of (52) implies that $M_p(A)$ can be computed in polynomial time to within a factor $\gamma_p^2$. This follows since the left hand side of (52) is the maximum of $\sum_{i=1}^n \sum_{j=1}^n a_{ij} X_{ij}$, which is a linear functional in the variables $(X_{ij})$, given the constraint that $(X_{ij})$ is a symmetric positive semidefinite matrix and $\sum_{i=1}^n X_{ii}^{p/2} \le 1$. The latter constraint is convex since $p \ge 2$, and therefore this problem falls into the framework of convex programming that was described in Section 1.2. Thus the left hand side of (52) can be computed in polynomial time with arbitrarily good precision.

Choosing the specific value $p = 3$ in order to illustrate the current satisfactory state of affairs concretely, the NP-hardness threshold of computing $\max_{\sum_{i=1}^n |t_i|^3 \le 1} \sum_{i=1}^n \sum_{j=1}^n a_{ij} t_i t_j$ equals $2/\sqrt[3]{\pi}$. Such a sharp NP-hardness result (with transcendental hardness ratio) is quite remarkable, since it shows that the geometric algorithm presented above probably yields the best possible approximation guarantee even when one allows any polynomial time algorithm whatsoever. Results of this type have been known to hold under the UGC, but this NP-hardness result of [18] seems to be the first time that such an algorithm for a simple to state problem was shown to be optimal assuming P $\ne$ NP.
When $p \in [1, 2]$ one can easily show [92] that

$$\max_{\substack{x_1, \ldots, x_n \in \mathbb{R}^n \\ \sum_{k=1}^n \|x_k\|_2^p \le 1}} \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle x_i, x_j \rangle = \max_{\substack{t_1, \ldots, t_n \in \mathbb{R} \\ \sum_{k=1}^n |t_k|^p \le 1}} \sum_{i=1}^n \sum_{j=1}^n a_{ij} t_i t_j. \tag{53}$$

While the identity (53) seems to indicate that the problem of computing $M_p(A)$ in polynomial time might be easy for $p \in (1, 2)$, the above argument fails since the constraint $\sum_{i=1}^n X_{ii}^{p/2} \le 1$ is no longer convex. This is reflected by the fact that despite (53) the problem of $(1 + \varepsilon)$-approximately computing $M_1(A)$ is W[1] complete [2]. It remains open whether for $p \in (1, 2)$ one can approximate $M_p(A)$ in polynomial time up to a factor $O(1)$, and no hardness of approximation result is known for this problem either.

Remark 5.1. If $p \in [2, \infty]$ then for positive semidefinite matrices $(a_{ij})$ the constant $\gamma_p^2$ in the right hand side of (52) can be improved [92] to $\gamma_{p^*}^{-2}$, where here and in what follows $p^* = p/(p - 1)$. For $p = \infty$ this estimate coincides with the classical bound [45, 107] that we have already encountered in (43), and it is sharp in the entire range $p \in [2, \infty]$. Moreover, this bound shows that there exists a polynomial time algorithm that takes as input a positive semidefinite matrix $A$ and outputs a number that is guaranteed to be within a factor $\gamma_{p^*}^{-2}$ of $M_p(A)$. Conversely, the existence of a polynomial time algorithm for this problem whose approximation guarantee is strictly smaller than $\gamma_{p^*}^{-2}$ would contradict the UGC.
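
As a quick consistency check of Remark 5.1, reading the improved constant as $\gamma_{p^*}^{-2}$ with $p^* = p/(p-1)$ and $\gamma_q = (\mathbb{E}|G|^q)^{1/q}$, the two endpoint cases can be worked out by hand. The first recovers the factor $\pi/2$ of the positive semidefinite Grothendieck inequality, and the second gives the constant 1, as expected since for positive semidefinite $A$ and $p = 2$ both sides of (52) equal the largest eigenvalue of $A$.

```latex
p = \infty:\quad p^* = 1,\qquad
\gamma_1 = \mathbb{E}|G| = \sqrt{\tfrac{2}{\pi}},\qquad
\gamma_{p^*}^{-2} = \frac{\pi}{2};
\qquad\qquad
p = 2:\quad p^* = 2,\qquad
\gamma_2 = \left(\mathbb{E}\,G^2\right)^{1/2} = 1,\qquad
\gamma_{p^*}^{-2} = 1.
```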



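Remark 5.2. The bilinear variant of (52) is an immediate consequence of the Grothendieck inequality (1). Specifically, assume that $p, q \in [1, \infty]$ and $x_1, \ldots, x_m, y_1, \ldots, y_n \in \mathbb{R}^{m+n}$ satisfy $\sum_{i=1}^m \|x_i\|_2^p \le 1$ and $\sum_{j=1}^n \|y_j\|_2^q \le 1$. Write $\alpha_i = \|x_i\|_2$ and $\beta_j = \|y_j\|_2$. For an $m \times n$ matrix $(a_{ij})$ the Grothendieck inequality provides $\varepsilon_1, \ldots, \varepsilon_m, \delta_1, \ldots, \delta_n \in \{-1, 1\}$ such that $\sum_{i=1}^m \sum_{j=1}^n a_{ij} \langle x_i, y_j \rangle \le K_G \sum_{i=1}^m \sum_{j=1}^n a_{ij} \alpha_i \beta_j \varepsilon_i \delta_j$. Since $s_i = \alpha_i \varepsilon_i$ and $t_j = \beta_j \delta_j$ satisfy $\sum_{i=1}^m |s_i|^p \le 1$ and $\sum_{j=1}^n |t_j|^q \le 1$, this establishes the following inequality.

$$\max_{\substack{\{x_i\}_{i=1}^m, \{y_j\}_{j=1}^n \subseteq \mathbb{R}^{m+n} \\ \sum_{i=1}^m \|x_i\|_2^p \le 1,\ \sum_{j=1}^n \|y_j\|_2^q \le 1}} \sum_{i=1}^m \sum_{j=1}^n a_{ij} \langle x_i, y_j \rangle \le K_G \cdot \max_{\substack{\{s_i\}_{i=1}^m, \{t_j\}_{j=1}^n \subseteq \mathbb{R} \\ \sum_{i=1}^m |s_i|^p \le 1,\ \sum_{j=1}^n |t_j|^q \le 1}} \sum_{i=1}^m \sum_{j=1}^n a_{ij} s_i t_j. \tag{54}$$

Observe that the maximum on the right hand side of (54) is $\|A\|_{p \to q^*}$, the operator norm of $A$ acting as a linear operator from $(\mathbb{R}^m, \|\cdot\|_p)$ to $(\mathbb{R}^n, \|\cdot\|_{q^*})$. Moreover, if $p, q \ge 2$ then the left hand side of (54) can be computed in polynomial time. Thus, for $p \ge 2 \ge r \ge 1$, the generalized Grothendieck inequality (54) yields a polynomial time algorithm that takes as input an $m \times n$ matrix $A = (a_{ij})$ and outputs a number that is guaranteed to be within a factor $K_G$ of $\|A\|_{p \to r}$. This algorithmic task has been previously studied in [96] (see also [93, Sec. 4.3.2]), where for $p \ge 2 \ge r \ge 1$ a polynomial time algorithm was designed that approximates $\|A\|_{p \to r}$ up to a factor $3\pi/(6\sqrt{3} - 2\pi) \in [2.293, 2.294]$. The above argument yields the approximation factor $K_G < 1.783$ as a formal consequence of the Grothendieck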
inequality. The complexity of the problem of approximating $\|A\|_{p \to r}$ has been studied in [17], where it is shown that if either $p \ge r > 2$ or $2 > p \ge r$ then it is NP-hard to approximate $\|A\|_{p \to r}$ up to any constant factor, and unless 3-colorability can be solved in time $2^{(\log n)^{O(1)}}$, for any $\varepsilon \in (0, 1)$ no polynomial time algorithm can approximate $\|A\|_{p \to r}$ up to a factor of $2^{(\log n)^{1 - \varepsilon}}$.

Remark 5.3. Let $K \subseteq \mathbb{R}^n$ be a compact and convex set which is invariant under reflections with respect to the coordinate hyperplanes. Denote by $C_K$ the smallest $C \in (0, \infty)$ such that for every $n \times n$ matrix $(a_{ij})$ we have

$$\max_{\substack{x_1, \ldots, x_n \in \mathbb{R}^n \\ (\|x_1\|_2, \ldots, \|x_n\|_2) \in K}} \sum_{i=1}^n \sum_{j=1}^n a_{ij} \langle x_i, x_j \rangle \le C \max_{(t_1, \ldots, t_n) \in K} \sum_{i=1}^n \sum_{j=1}^n a_{ij} t_i t_j. \tag{55}$$

Such generalized Grothendieck inequalities are investigated in [92], where bounds on $C_K$ are obtained under certain geometric assumptions on $K$. These assumptions are easy to verify when $K = \{x \in \mathbb{R}^n : \|x\|_p \le 1\}$, yielding (52). More subtle inequalities of this type for other convex bodies $K$ are discussed in [92], but we will not describe them here. The natural bilinear version of (55) is: if $K \subseteq \mathbb{R}^m$ and $L \subseteq \mathbb{R}^n$ are compact and convex sets that are invariant under reflections with respect to the coordinate hyperplanes then let $C_{K,L}$ denote the smallest constant $C \in (0, \infty)$ such that for every $m \times n$ matrix $(a_{ij})$ we have

$$\max_{\substack{\{x_i\}_{i=1}^m, \{y_j\}_{j=1}^n \subseteq \mathbb{R}^{m+n} \\ (\|x_1\|_2, \ldots, \|x_m\|_2) \in K \\ (\|y_1\|_2, \ldots, \|y_n\|_2) \in L}} \sum_{i=1}^m \sum_{j=1}^n a_{ij} \langle x_i, y_j \rangle \le C \max_{\substack{(s_1, \ldots, s_m) \in K \\ (t_1, \ldots, t_n) \in L}} \sum_{i=1}^m \sum_{j=1}^n a_{ij} s_i t_j. \tag{56}$$

The argument in Remark 5.2 shows that $C_{K,L} \le K_G$. Under certain geometric assumptions on $K, L$ this bound can be improved [92].



6. Higher rank Grothendieck inequalities 

We have already seen several variants of the classical Grothendieck inequality (1), including the Grothendieck inequality for graphs (36), the variant of the positive semidefinite Grothendieck inequality arising from the Kernel Clustering problem (44), and Grothendieck inequalities for convex bodies other than the cube (52), (54), (55), (56). The literature contains additional variants of the Grothendieck inequality, some of which will be described in this section.

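Let $G = (\{1, \ldots, n\}, E)$ be a graph and fix $q, r \in \mathbb{N}$. Following [23], let $K(q \to r, G)$ be the smallest constant $K \in (0, \infty)$ such that for every $n \times n$ matrix $A = (a_{ij})$ we have

$$\max_{x_1, \ldots, x_n \in S^{q-1}} \sum_{\substack{i, j \in \{1, \ldots, n\} \\ \{i,j\} \in E}} a_{ij} \langle x_i, x_j \rangle \le K \max_{y_1, \ldots, y_n \in S^{r-1}} \sum_{\substack{i, j \in \{1, \ldots, n\} \\ \{i,j\} \in E}} a_{ij} \langle y_i, y_j \rangle. \tag{57}$$

Set also $K(r, G) = \sup_{q \in \mathbb{N}} K(q \to r, G)$. We similarly define $K^+(q \to r, G)$ to be the smallest constant $K \in (0, \infty)$ satisfying (57) for all positive semidefinite matrices $A$, and correspondingly $K^+(r, G) = \sup_{q \in \mathbb{N}} K^+(q \to r, G)$.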

To link these definitions to what we have already seen in this article, observe that $K_G$ is the supremum of $K(1, G)$ over all finite bipartite graphs $G$, and due to the results described in Section 4 we have

sup/r+(r,ir^) = sup sup —- \- — ^, (58) 

where $K_n^{\circlearrowleft}$ is the complete graph on $n$ vertices with self loops. Recall that the definition of $C(B)$ for a positive semidefinite matrix $B$ is given in the paragraph following (44).

The most important special case of (57) is when $r = 2$, since the supremum of $K(2, G)$ over all finite bipartite graphs $G$, denoted $K_G^{\mathbb{C}}$, is the complex Grothendieck constant, a fundamental quantity whose value has been investigated in [15], IHSl [991 EOl [71]. The best known bounds on $K_G^{\mathbb{C}}$ are $1.338 < K_G^{\mathbb{C}} < 1.4049$; see [Ml Sec. 4] for more information on this topic. We also refer to [32, 113] for information on the constants $K(2q \to 2, G)$ where $G$ is a bipartite graph. The supremum of $K(q \to r, G)$ over all bipartite graphs $G$ was investigated in [7S] for $r = 1$ and in [71] for $r = 2$; see also [75] for a unified treatment of these cases. The higher rank constants $K(q \to r, G)$ when $G$ is bipartite were introduced in [22]. Definition (57) in full generality is due to [23], where several estimates on $K(q \to r, G)$ are given. One of the motivations of [23] is the case $r = 3$ (and $G$ a subgraph of the grid $\mathbb{Z}^3$), based on the connection to the polynomial time approximation of ground states of spin glasses as described in Section 3.1.1; the case $r = 1$ was discussed in Section 3.1.1 in connection with the Ising model, but the case $r = 3$ corresponds to the more physically realistic Heisenberg model of vector-valued spins. The parameter $\sup_{n \in \mathbb{N}} K^+(r, K_n^{\circlearrowleft})$ (recall (58)) was studied in [22] in the context of quantum information theory, and in [21] it was shown that

»'"nV r(n/2) ) 2 4n^ \ 



71^ 



(59) 



and 



We refer to [2^ for a corresponding UGC hardness result. Note that (59) improves over (43) 
for fixed n G N. 

7. Hardness of approximation 

We have seen examples of how Grothendieck-type inequalities yield upper bounds on the best possible polynomial time approximation ratio of certain optimization problems. From the algorithmic and computational complexity viewpoint it is interesting to prove computational lower bounds as well, i.e., results that rule out the existence of efficient algorithms achieving a certain approximation guarantee. Such results are known as hardness or inapproximability results, and as explained in Section 1.1, at present the state of the art allows one to prove such results while relying on complexity theoretic assumptions such as P $\ne$ NP or the Unique Games Conjecture. A nice feature of the known hardness results for problems in which a Grothendieck-type inequality has been applied is that often the hardness results (lower bounds) exactly match the approximation ratios (upper bounds). In this section we briefly review the known hardness results for optimization problems associated with Grothendieck-type inequalities.

Let $K_{n,n}$-QP denote the optimization problem associated with the classical Grothendieck inequality (the acronym QP stands for "quadratic programming"). Thus, in the problem $K_{n,n}$-QP we are given an $n \times n$ real matrix $(a_{ij})$ and the goal is to determine the quantity

$$\max\left\{ \sum_{i=1}^n \sum_{j=1}^n a_{ij} \varepsilon_i \delta_j : \{\varepsilon_i\}_{i=1}^n, \{\delta_j\}_{j=1}^n \subseteq \{-1, 1\} \right\}.$$

As explained in [8], the MAX DICUT problem can be framed as a special case of the problem $K_{n,n}$-QP. Hence, as a consequence of [51], we know that for every $\varepsilon \in (0, 1)$, assuming P $\ne$ NP there is no polynomial time algorithm that approximates the $K_{n,n}$-QP problem within ratio $\frac{13}{12} - \varepsilon$. In [68] it is shown that the lower bound $\frac{\pi}{2} e^{\eta_0^2}$ on the Grothendieck constant can be translated into a hardness result, albeit relying on the Unique Games Conjecture. Namely, letting $\eta_0$ be as in (3), for every $\varepsilon \in (0, 1)$ assuming the UGC there is no polynomial time algorithm that approximates the $K_{n,n}$-QP problem within a ratio $\frac{\pi}{2} e^{\eta_0^2} - \varepsilon$.

We note that all the hardness results cited here rely on the well-known paradigm of dictatorship testing. A lower bound on the integrality gap of a semidefinite program, such as the estimate $K_G \ge \frac{\pi}{2} e^{\eta_0^2}$, can be translated into a probabilistic test to check whether a function $f : \{-1, 1\}^n \to \{-1, 1\}$ is a dictatorship, i.e., of the form $f(x) = x_i$ for some fixed $i \in \{1, \ldots, n\}$. If $f$ is indeed a dictatorship, then the test passes with probability $c$, and if $f$ is "far from a dictator" (in a formal sense that we do not describe here), the test passes with probability at most $s$. The ratio $c/s$ corresponds exactly to the UGC-based hardness lower bound. It is well-known how to prove a UGC-based hardness result once we have the appropriate dictatorship test; see the survey [63].

The above quoted result of [68] relied on explicitly knowing the lower bound construction [105] leading to the estimate $K_G \ge \frac{\pi}{2} e^{\eta_0^2}$. On the other hand, in [104], building on the earlier work [103], it is shown that any lower bound on the Grothendieck constant can be translated into a UGC-based hardness result, even without explicitly knowing the construction! Thus, modulo the UGC, the best polynomial time algorithm to approximate the $K_{n,n}$-QP problem is via the Grothendieck inequality, even though we do not know the precise value of $K_G$. Formally, for every $\varepsilon \in (0, 1)$, assuming the UGC there is no polynomial time algorithm that approximates the $K_{n,n}$-QP problem within a factor $K_G - \varepsilon$.

Let $K_{n,n}$-QP$_{\mathrm{PSD}}$ be the special case of the $K_{n,n}$-QP problem where the input matrix $(a_{ij})$ is assumed to be positive semidefinite. By considering matrices that are Laplacians of graphs one sees that the MAX CUT problem is a special case of the problem $K_{n,n}$-QP$_{\mathrm{PSD}}$ (see [66]). Hence, due to [51], we know that for every $\varepsilon \in (0, 1)$, assuming P $\ne$ NP there is no polynomial time algorithm that approximates the $K_{n,n}$-QP$_{\mathrm{PSD}}$ problem within ratio $\frac{17}{16} - \varepsilon$. Moreover, it is proved in [66] that for every $\varepsilon \in (0, 1)$, assuming the UGC there is no polynomial time algorithm that approximates the $K_{n,n}$-QP$_{\mathrm{PSD}}$ problem within ratio $\frac{\pi}{2} - \varepsilon$, an optimal hardness result due to the positive semidefinite Grothendieck inequality (43). This follows from the more general results for the Kernel Clustering problem described later.

Let $(a_{ij})$ be an $n \times n$ real matrix with zeros on the diagonal. The $K_n$-QP problem seeks to determine the quantity

$$\max\left\{ \sum_{i=1}^n \sum_{j=1}^n a_{ij} \varepsilon_i \varepsilon_j : \{\varepsilon_i\}_{i=1}^n \subseteq \{-1, 1\} \right\}.$$

In [SI] it is proved that for every $\gamma \in (0, 1/6)$, assuming that NP does not have a $2^{(\log n)^{O(1)}}$ time deterministic algorithm, there is no polynomial time algorithm that approximates the $K_n$-QP problem within ratio $(\log n)^\gamma$. This improves over [12] where a hardness factor of $(\log n)^c$ was proved, under the same complexity assumption, for an unspecified universal constant $c > 0$. Recall that, as explained in Section 3, there is an algorithm for $K_n$-QP that achieves a ratio of $O(\log n)$, so there remains an asymptotic gap in our understanding of the complexity of the $K_n$-QP problem. For the maximum acyclic subgraph problem, as discussed in Section 2.1.3, the gap between the upper and lower bounds is even larger. We have already seen that an approximation factor of $O(\log n)$ is achievable, but from the hardness perspective we know due to [HZ] that there exists $\varepsilon_0 > 0$ such that assuming P $\ne$ NP there is no polynomial time algorithm for the maximum acyclic subgraph problem that achieves an approximation ratio less than $1 + \varepsilon_0$. In [37] it was shown that assuming the UGC there is no polynomial time algorithm for the maximum acyclic subgraph problem that achieves any constant approximation ratio.

Fix $p \in (0, \infty)$. As discussed in Section 5, the $L_p$ Grothendieck problem is as follows. Given an $n \times n$ real matrix $A = (a_{ij})$ with zeros on the diagonal, the goal is to determine the quantity $M_p(A)$ defined in (49). For $p \in (2, \infty)$ it has been shown that for every $\varepsilon \in (0, 1)$, assuming P $\ne$ NP there is no polynomial time algorithm that approximates the $L_p$ Grothendieck problem within a ratio $\gamma_p^2 - \varepsilon$. Here $\gamma_p$ is defined as in (51). This result (nontrivially) builds on the previous result of [70] that obtained the same conclusion while assuming the UGC rather than P $\ne$ NP.

For the Kernel Clustering problem with a $k \times k$ hypothesis matrix $B$, an optimal hardness result is obtained in [67] in terms of the parameters $R(B)$ and $C(B)$ described in Section 4. Specifically, for a fixed $k \times k$ symmetric positive semidefinite matrix $B$ and for every $\varepsilon \in (0, 1)$, assuming the UGC there is no polynomial time algorithm that, given an $n \times n$ matrix $A$, approximates the quantity $\mathrm{Clust}(A|B)$ within ratio $\frac{R(B)^2}{C(B)} - \varepsilon$. When $B = I_k$ is the $k \times k$ identity matrix, the following hardness result is obtained in [66]. Let $\varepsilon > 0$ be an arbitrarily small constant. Assuming the UGC, there is no polynomial time algorithm that approximates $\mathrm{Clust}(A|I_2)$ within ratio $\frac{\pi}{2} - \varepsilon$. Similarly, assuming the UGC there is no polynomial time algorithm that approximates $\mathrm{Clust}(A|I_3)$ within ratio $\frac{16\pi}{27} - \varepsilon$, and, using also the solution of the propeller conjecture in $\mathbb{R}^3$ given in [53], there is no polynomial time algorithm that approximates $\mathrm{Clust}(A|I_4)$ within ratio $\frac{2\pi}{3} - \varepsilon$. Furthermore, for $k \ge 5$, assuming the propeller conjecture and the UGC, there is no polynomial time algorithm that approximates $\mathrm{Clust}(A|I_k)$ within ratio $\frac{8\pi}{9}\left(1 - \frac{1}{k}\right) - \varepsilon$.

Acknowledgements. We are grateful to Oded Regev for many helpful suggestions. 